Search results for: graph databases
1075 A NoSQL Based Approach for Real-Time Managing of Robotics's Data
Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir
Abstract:
This paper deals with the secret of the continual progression data that new data management solutions have been emerged: The NoSQL databases. They crossed several areas like personalization, profile management, big data in real-time, content management, catalog, view of customers, mobile applications, internet of things, digital communication and fraud detection. Nowadays, these database management systems are increasing. These systems store data very well and with the trend of big data, a new challenge’s store demands new structures and methods for managing enterprise data. The new intelligent machine in the e-learning sector, thrives on more data, so smart machines can learn more and faster. The robotics are our use case to focus on our test. The implementation of NoSQL for Robotics wrestle all the data they acquire into usable form because with the ordinary type of robotics; we are facing very big limits to manage and find the exact information in real-time. Our original proposed approach was demonstrated by experimental studies and running example used as a use case.Keywords: NoSQL databases, database management systems, robotics, big data
Procedia PDF Downloads 3561074 Real-Time Network Anomaly Detection Systems Based on Machine-Learning Algorithms
Authors: Zahra Ramezanpanah, Joachim Carvallo, Aurelien Rodriguez
Abstract:
This paper aims to detect anomalies in streaming data using machine learning algorithms. In this regard, we designed two separate pipelines and evaluated the effectiveness of each separately. The first pipeline, based on supervised machine learning methods, consists of two phases. In the first phase, we trained several supervised models using the UNSW-NB15 data-set. We measured the efficiency of each using different performance metrics and selected the best model for the second phase. At the beginning of the second phase, we first, using Argus Server, sniffed a local area network. Several types of attacks were simulated and then sent the sniffed data to a running algorithm at short intervals. This algorithm can display the results of each packet of received data in real-time using the trained model. The second pipeline presented in this paper is based on unsupervised algorithms, in which a Temporal Graph Network (TGN) is used to monitor a local network. The TGN is trained to predict the probability of future states of the network based on its past behavior. Our contribution in this section is introducing an indicator to identify anomalies from these predicted probabilities.Keywords: temporal graph network, anomaly detection, cyber security, IDS
Procedia PDF Downloads 1031073 Row Detection and Graph-Based Localization in Tree Nurseries Using a 3D LiDAR
Authors: Ionut Vintu, Stefan Laible, Ruth Schulz
Abstract:
Agricultural robotics has been developing steadily over recent years, with the goal of reducing and even eliminating pesticides used in crops and to increase productivity by taking over human labor. The majority of crops are arranged in rows. The first step towards autonomous robots, capable of driving in fields and performing crop-handling tasks, is for robots to robustly detect the rows of plants. Recent work done towards autonomous driving between plant rows offers big robotic platforms equipped with various expensive sensors as a solution to this problem. These platforms need to be driven over the rows of plants. This approach lacks flexibility and scalability when it comes to the height of plants or distance between rows. This paper proposes instead an algorithm that makes use of cheaper sensors and has a higher variability. The main application is in tree nurseries. Here, plant height can range from a few centimeters to a few meters. Moreover, trees are often removed, leading to gaps within the plant rows. The core idea is to combine row detection algorithms with graph-based localization methods as they are used in SLAM. Nodes in the graph represent the estimated pose of the robot, and the edges embed constraints between these poses or between the robot and certain landmarks. This setup aims to improve individual plant detection and deal with exception handling, like row gaps, which are falsely detected as an end of rows. Four methods were developed for detecting row structures in the fields, all using a point cloud acquired with a 3D LiDAR as an input. Comparing the field coverage and number of damaged plants, the method that uses a local map around the robot proved to perform the best, with 68% covered rows and 25% damaged plants. This method is further used and combined with a graph-based localization algorithm, which uses the local map features to estimate the robot’s position inside the greater field. Testing the upgraded algorithm in a variety of simulated fields shows that the additional information obtained from localization provides a boost in performance over methods that rely purely on perception to navigate. The final algorithm achieved a row coverage of 80% and an accuracy of 27% damaged plants. Future work would focus on achieving a perfect score of 100% covered rows and 0% damaged plants. The main challenges that the algorithm needs to overcome are fields where the height of the plants is too small for the plants to be detected and fields where it is hard to distinguish between individual plants when they are overlapping. The method was also tested on a real robot in a small field with artificial plants. The tests were performed using a small robot platform equipped with wheel encoders, an IMU and an FX10 3D LiDAR. Over ten runs, the system achieved 100% coverage and 0% damaged plants. The framework built within the scope of this work can be further used to integrate data from additional sensors, with the goal of achieving even better results.Keywords: 3D LiDAR, agricultural robots, graph-based localization, row detection
Procedia PDF Downloads 1401072 Ontology-Based Backpropagation Neural Network Classification and Reasoning Strategy for NoSQL and SQL Databases
Authors: Hao-Hsiang Ku, Ching-Ho Chi
Abstract:
Big data applications have become an imperative for many fields. Many researchers have been devoted into increasing correct rates and reducing time complexities. Hence, the study designs and proposes an Ontology-based backpropagation neural network classification and reasoning strategy for NoSQL big data applications, which is called ON4NoSQL. ON4NoSQL is responsible for enhancing the performances of classifications in NoSQL and SQL databases to build up mass behavior models. Mass behavior models are made by MapReduce techniques and Hadoop distributed file system based on Hadoop service platform. The reference engine of ON4NoSQL is the ontology-based backpropagation neural network classification and reasoning strategy. Simulation results indicate that ON4NoSQL can efficiently achieve to construct a high performance environment for data storing, searching, and retrieving.Keywords: Hadoop, NoSQL, ontology, back propagation neural network, high distributed file system
Procedia PDF Downloads 2621071 Developing a Deep Understanding of the Immune Response in Hepatitis B Virus Infected Patients Using a Knowledge Driven Approach
Authors: Hanan Begali, Shahi Dost, Annett Ziegler, Markus Cornberg, Maria-Esther Vidal, Anke R. M. Kraft
Abstract:
Chronic hepatitis B virus (HBV) infection can be treated with nucleot(s)ide analog (NA), for example, which inhibits HBV replication. However, they have hardly any influence on the functional cure of HBV, which is defined by hepatitis B surface antigen (HBsAg) loss. NA needs to be taken life-long, which is not available for all patients worldwide. Additionally, NA-treated patients are still at risk of developing cirrhosis, liver failure, or hepatocellular carcinoma (HCC). Although each patient has the same components of the immune system, immune responses vary between patients. Therefore, a deeper understanding of the immune response against HBV in different patients is necessary to understand the parameters leading to HBV cure and to use this knowledge to optimize HBV therapies. This requires seamless integration of an enormous amount of diverse and fine-grained data from viral markers, e.g., hepatitis B core-related antigen (HBcrAg) and hepatitis B surface antigen (HBsAg). The data integration system relies on the assumption that profiling human immune systems requires the analysis of various variables (e.g., demographic data, treatments, pre-existing conditions, immune cell response, or HLA-typing) rather than only one. However, the values of these variables are collected independently. They are presented in a myriad of formats, e.g., excel files, textual descriptions, lab book notes, and images of flow cytometry dot plots. Additionally, patients can be identified differently in these analyses. This heterogeneity complicates the integration of variables, as data management techniques are needed to create a unified view in which individual formats and identifiers are transparent when profiling the human immune systems. The proposed study (HBsRE) aims at integrating heterogeneous data sets of 87 chronically HBV-infected patients, e.g., clinical data, immune cell response, and HLA-typing, with knowledge encoded in biomedical ontologies and open-source databases into a knowledge-driven framework. This new technique enables us to harmonize and standardize heterogeneous datasets in the defined modeling of the data integration system, which will be evaluated in the knowledge graph (KG). KGs are data structures that represent the knowledge and data as factual statements using a graph data model. Finally, the analytic data model will be applied on top of KG in order to develop a deeper understanding of the immune profiles among various patients and to evaluate factors playing a role in a holistic profile of patients with HBsAg level loss. Additionally, our objective is to utilize this unified approach to stratify patients for new effective treatments. This study is developed in the context of the project “Transforming big data into knowledge: for deep immune profiling in vaccination, infectious diseases, and transplantation (ImProVIT)”, which is a multidisciplinary team composed of computer scientists, infection biologists, and immunologists.Keywords: chronic hepatitis B infection, immune response, knowledge graphs, ontology
Procedia PDF Downloads 1081070 A Study on the Computation of Gourava Indices for Poly-L Lysine Dendrimer and Its Biomedical Applications
Authors: M. Helen
Abstract:
Chemical graph serves as a convenient model for any real or abstract chemical system. Dendrimers are novel three dimensional hyper branched globular nanopolymeric architectures. Drug delivery scientists are especially enthusiastic about possible utility of dendrimers as drug delivery tool. Dendrimers like poly L lysine (PLL), poly-propylene imine (PPI) and poly-amidoamine (PAMAM), etc., are used as gene carrier in drug delivery system because of their chemical characteristics. These characteristics of chemical compounds are analysed using topological indices (invariants under graph isomorphism) such as Wiener index, Zagreb index, etc., Prof. V. R. Kulli motivated by the application of Zagreb indices in finding the total π energy and derived Gourava indices which is an improved version over Zagreb indices. In this paper, we study the structure of PLL-Dendrimer that has the following applications: reduction in toxicity, colon delivery, and topical delivery. Also, we determine first and second Gourava indices, first and second hyper Gourava indices, product and sum connectivity Gourava indices for PLL-Dendrimer. Gourava Indices have found applications in Quantitative Structure-Property Relationship (QSPR)/ Quantitative Structure-Activity Relationship (QSAR) studies.Keywords: connectivity Gourava indices, dendrimer, Gourava indices, hyper GouravaG indices
Procedia PDF Downloads 1401069 Toward an Understanding of the Neurofunctional Dissociation between Animal and Tool Concepts: A Graph Theoretical Analysis
Authors: Skiker Kaoutar, Mounir Maouene
Abstract:
Neuroimaging studies have shown that animal and tool concepts rely on distinct networks of brain areas. Animal concepts depend predominantly on temporal areas while tool concepts rely on fronto-temporo-parietal areas. However, the origin of this neurofunctional distinction for processing animal and tool concepts remains still unclear. Here, we address this question from a network perspective suggesting that the neural distinction between animals and tools might reflect the differences in their structural semantic networks. We build semantic networks for animal and tool concepts derived from Mc Rae and colleagues’s behavioral study conducted on a large number of participants. These two networks are thus analyzed through a large number of graph theoretical measures for small-worldness: centrality, clustering coefficient, average shortest path length, as well as resistance to random and targeted attacks. The results indicate that both animal and tool networks have small-world properties. More importantly, the animal network is more vulnerable to targeted attacks compared to the tool network a result that correlates with brain lesions studies.Keywords: animals, tools, network, semantics, small-world, resilience to damage
Procedia PDF Downloads 5491068 Accessibility and Visibility through Space Syntax Analysis of the Linga Raj Temple in Odisha, India
Authors: S. Pramanik
Abstract:
Since the early ages, the Hindu temples have been interpreted through various Vedic philosophies. These temples are visited by pilgrims which demonstrate the rituals and religious belief of communities, reflecting a variety of actions and behaviors. Darsana— a direct seeing, is a part of the pilgrimage activity. During the process of Darsana, a devotee is prepared for entry in the temple to realize the cognizing Truth culminating in visualizing the idol of God, placed at the Garbhagriha (sanctum sanctorum). For this, the pilgrim must pass through a sequential arrangement of spaces. During the process of progress, the pilgrims visualize the spaces differently from various points of views. The viewpoints create a variety of spatial patterns in the minds of pilgrims coherent to the Hindu philosophies. The space organization and its order are perceived by various techniques of spatial analysis. A temple, as examples of Kalinga stylistic variations, has been chosen for the study. This paper intends to demonstrate some visual patterns generated during the process of Darsana (visibility) and its accessibility by Point Isovist Studies and Visibility Graph Analysis from the entrance (Simha Dwara) to The Sanctum sanctorum (Garbhagriha).Keywords: Hindu temple architecture, point isovist, space syntax analysis, visibility graph analysis
Procedia PDF Downloads 1211067 Anomaly Detection with ANN and SVM for Telemedicine Networks
Authors: Edward Guillén, Jeisson Sánchez, Carlos Omar Ramos
Abstract:
In recent years, a wide variety of applications are developed with Support Vector Machines -SVM- methods and Artificial Neural Networks -ANN-. In general, these methods depend on intrusion knowledge databases such as KDD99, ISCX, and CAIDA among others. New classes of detectors are generated by machine learning techniques, trained and tested over network databases. Thereafter, detectors are employed to detect anomalies in network communication scenarios according to user’s connections behavior. The first detector based on training dataset is deployed in different real-world networks with mobile and non-mobile devices to analyze the performance and accuracy over static detection. The vulnerabilities are based on previous work in telemedicine apps that were developed on the research group. This paper presents the differences on detections results between some network scenarios by applying traditional detectors deployed with artificial neural networks and support vector machines.Keywords: anomaly detection, back-propagation neural networks, network intrusion detection systems, support vector machines
Procedia PDF Downloads 3591066 Optimizing the Location of Parking Areas Adapted for Dangerous Goods in the European Road Transport Network
Authors: María Dolores Caro, Eugenio M. Fedriani, Ángel F. Tenorio
Abstract:
The transportation of dangerous goods by lorries throughout Europe must be done by using the roads conforming the European Road Transport Network. In this network, there are several parking areas where lorry drivers can park to rest according to the regulations. According to the "European Agreement concerning the International Carriage of Dangerous Goods by Road", parking areas where lorries transporting dangerous goods can park to rest, must follow several security stipulations to keep safe the rest of road users. At this respect, these lorries must be parked in adapted areas with strict and permanent surveillance measures. Moreover, drivers must satisfy several restrictions about resting and driving time. Under these facts, one may expect that there exist enough parking areas for the transport of this type of goods in order to obey the regulations prescribed by the European Union and its member countries. However, the already-existing parking areas are not sufficient to cover all the stops required by drivers transporting dangerous goods. Our main goal is, starting from the already-existing parking areas and the loading-and-unloading location, to provide an optimal answer to the following question: how many additional parking areas must be built and where must they be located to assure that lorry drivers can transport dangerous goods following all the stipulations about security and safety for their stops? The sense of the word “optimal” is due to the fact that we give a global solution for the location of parking areas throughout the whole European Road Transport Network, adjusting the number of additional areas to be as lower as possible. To do so, we have modeled the problem using graph theory since we are working with a road network. As nodes, we have considered the locations of each already-existing parking area, each loading-and-unloading area each road bifurcation. Each road connecting two nodes is considered as an edge in the graph whose weight corresponds to the distance between both nodes in the edge. By applying a new efficient algorithm, we have found the additional nodes for the network representing the new parking areas adapted for dangerous goods, under the fact that the distance between two parking areas must be less than or equal to 400 km.Keywords: trans-european transport network, dangerous goods, parking areas, graph-based modeling
Procedia PDF Downloads 2811065 Optimization of Feeder Bus Routes at Urban Rail Transit Stations Based on Link Growth Probability
Authors: Yu Song, Yuefei Jin
Abstract:
Urban public transportation can be integrated when there is an efficient connection between urban rail lines, however, there are currently no effective or quick solutions being investigated for this connection. This paper analyzes the space-time distribution and travel demand of passenger connection travel based on taxi track data and data from the road network, excavates potential bus connection stations based on potential connection demand data, and introduces the link growth probability model in the complex network to solve the basic connection bus lines in order to ascertain the direction of the bus lines that are the most connected given the demand characteristics. Then, a tree view exhaustive approach based on constraints is suggested based on graph theory, which can hasten the convergence of findings while doing chain calculations. This study uses WEI QU NAN Station, the Xi'an Metro Line 2 terminal station in Shaanxi Province, as an illustration, to evaluate the model's and the solution method's efficacy. According to the findings, 153 prospective stations have been dug up in total, the feeder bus network for the entire line has been laid out, and the best route adjustment strategy has been found.Keywords: feeder bus, route optimization, link growth probability, the graph theory
Procedia PDF Downloads 781064 Quantifying Multivariate Spatiotemporal Dynamics of Malaria Risk Using Graph-Based Optimization in Southern Ethiopia
Authors: Yonas Shuke Kitawa
Abstract:
Background: Although malaria incidence has substantially fallen sharply over the past few years, the rate of decline varies by district, time, and malaria type. Despite this turn-down, malaria remains a major public health threat in various districts of Ethiopia. Consequently, the present study is aimed at developing a predictive model that helps to identify the spatio-temporal variation in malaria risk by multiple plasmodium species. Methods: We propose a multivariate spatio-temporal Bayesian model to obtain a more coherent picture of the temporally varying spatial variation in disease risk. The spatial autocorrelation in such a data set is typically modeled by a set of random effects that assign a conditional autoregressive prior distribution. However, the autocorrelation considered in such cases depends on a binary neighborhood matrix specified through the border-sharing rule. Over here, we propose a graph-based optimization algorithm for estimating the neighborhood matrix that merely represents the spatial correlation by exploring the areal units as the vertices of a graph and the neighbor relations as the series of edges. Furthermore, we used aggregated malaria count in southern Ethiopia from August 2013 to May 2019. Results: We recognized that precipitation, temperature, and humidity are positively associated with the malaria threat in the area. On the other hand, enhanced vegetation index, nighttime light (NTL), and distance from coastal areas are negatively associated. Moreover, nonlinear relationships were observed between malaria incidence and precipitation, temperature, and NTL. Additionally, lagged effects of temperature and humidity have a significant effect on malaria risk by either species. More elevated risk of P. falciparum was observed following the rainy season, and unstable transmission of P. vivax was observed in the area. Finally, P. vivax risks are less sensitive to environmental factors than those of P. falciparum. Conclusion: The improved inference was gained by employing the proposed approach in comparison to the commonly used border-sharing rule. Additionally, different covariates are identified, including delayed effects, and elevated risks of either of the cases were observed in districts found in the central and western regions. As malaria transmission operates in a spatially continuous manner, a spatially continuous model should be employed when it is computationally feasible.Keywords: disease mapping, MSTCAR, graph-based optimization algorithm, P. falciparum, P. vivax, waiting matrix
Procedia PDF Downloads 821063 Visualizing the Commercial Activity of a City by Analyzing the Data Information in Layers
Authors: Taras Agryzkov, Jose L. Oliver, Leandro Tortosa, Jose Vicent
Abstract:
This paper aims to demonstrate how network models can be used to understand and to deal with some aspects of urban complexity. As it is well known, the Theory of Architecture and Urbanism has been using for decades’ intellectual tools based on the ‘sciences of complexity’ as a strategy to propose theoretical approaches about cities and about architecture. In this sense, it is possible to find a vast literature in which for instance network theory is used as an instrument to understand very diverse questions about cities: from their commercial activity to their heritage condition. The contribution of this research consists in adding one step of complexity to this process: instead of working with one single primal graph as it is usually done, we will show how new network models arise from the consideration of two different primal graphs interacting in two layers. When we model an urban network through a mathematical structure like a graph, the city is usually represented by a set of nodes and edges that reproduce its topology, with the data generated or extracted from the city embedded in it. All this information is normally displayed in a single layer. Here, we propose to separate the information in two layers so that we can evaluate the interaction between them. Besides, both layers may be composed of structures that do not have to coincide: from this bi-layer system, groups of interactions emerge, suggesting reflections and in consequence, possible actions.Keywords: graphs, mathematics, networks, urban studies
Procedia PDF Downloads 1841062 Parameter Estimation for Contact Tracing in Graph-Based Models
Authors: Augustine Okolie, Johannes Müller, Mirjam Kretzchmar
Abstract:
We adopt a maximum-likelihood framework to estimate parameters of a stochastic susceptible-infected-recovered (SIR) model with contact tracing on a rooted random tree. Given the number of detectees per index case, our estimator allows to determine the degree distribution of the random tree as well as the tracing probability. Since we do not discover all infectees via contact tracing, this estimation is non-trivial. To keep things simple and stable, we develop an approximation suited for realistic situations (contract tracing probability small, or the probability for the detection of index cases small). In this approximation, the only epidemiological parameter entering the estimator is the basic reproduction number R0. The estimator is tested in a simulation study and applied to covid-19 contact tracing data from India. The simulation study underlines the efficiency of the method. For the empirical covid-19 data, we are able to compare different degree distributions and perform a sensitivity analysis. We find that particularly a power-law and a negative binomial degree distribution meet the data well and that the tracing probability is rather large. The sensitivity analysis shows no strong dependency on the reproduction number.Keywords: stochastic SIR model on graph, contact tracing, branching process, parameter inference
Procedia PDF Downloads 781061 Language Development and Growing Spanning Trees in Children Semantic Network
Authors: Somayeh Sadat Hashemi Kamangar, Fatemeh Bakouie, Shahriar Gharibzadeh
Abstract:
In this study, we target to exploit Maximum Spanning Trees (MST) of children's semantic networks to investigate their language development. To do so, we examine the graph-theoretic properties of word-embedding networks. The networks are made of words children learn prior to the age of 30 months as the nodes and the links which are built from the cosine vector similarity of words normatively acquired by children prior to two and a half years of age. These networks are weighted graphs and the strength of each link is determined by the numerical similarities of the two words (nodes) on the sides of the link. To avoid changing the weighted networks to the binaries by setting a threshold, constructing MSTs might present a solution. MST is a unique sub-graph that connects all the nodes in such a way that the sum of all the link weights is maximized without forming cycles. MSTs as the backbone of the semantic networks are suitable to examine developmental changes in semantic network topology in children. From these trees, several parameters were calculated to characterize the developmental change in network organization. We showed that MSTs provides an elegant method sensitive to capture subtle developmental changes in semantic network organization.Keywords: maximum spanning trees, word-embedding, semantic networks, language development
Procedia PDF Downloads 1471060 Factors Influencing Respectful Perinatal Care Among Healthcare Professionals In Low-and Middle-resource Countries: A Systematic Review
Authors: Petronella Lunda, Catharina Susanna Minnie, Welma Lubbe
Abstract:
Background This review aimed to provide healthcare professionals with a scientific summary of the best available research evidence on factors influencing respectful perinatal care. The review question was ‘What were the perceptions of midwives and doctors on factors that influence respectful perinatal care?’ Methods A detailed search was done on electronic databases: EBSCOhost: Medline, OAlster, Scopus, SciELO, Science Direct, PubMed, Psych INFO, and SocINDEX. The databases were searched for available literature using a predetermined search strategy. Reference lists of included studies were analysed to identify studies missing from databases. The phenomenon of interest was factors influencing maternity care practices according to midwives and doctors. Pre-determined inclusion and exclusion criteria were used during the selection of potential studies. In total, 13 studies were included in the data analysis and synthesis. Three themes were identified and a total of nine sub-themes. Results Studies conducted in various settings were included in the study. Multiple factors influencing respectful perinatal care were identified. During data synthesis, three themes emerged: healthcare institution, healthcare professionals, and women-related factors. Alongside the themes were sub-themes human resources, medical supplies, norms and practices, physical infrastructure, healthcare professional competencies and attributes, women’s knowledge, and preferences. The three factors influence the provision of respectful perinatal care; addressing them might improve the provision of the care. Conclusion Addressing factors that influence respectful perinatal care is vital towards the prevention of compromised patient care during the perinatal period as these factors have the potential to accelerate or hinder provision of respectful care.Keywords: doctors, maternity care, midwives, obstetrician, perceptions, perinatal care, respectful care
Procedia PDF Downloads 261059 Game Structure and Spatio-Temporal Action Detection in Soccer Using Graphs and 3D Convolutional Networks
Authors: Jérémie Ochin
Abstract:
Soccer analytics are built on two data sources: the frame-by-frame position of each player on the terrain and the sequences of events, such as ball drive, pass, cross, shot, throw-in... With more than 2000 ball-events per soccer game, their precise and exhaustive annotation, based on a monocular video stream such as a TV broadcast, remains a tedious and costly manual task. State-of-the-art methods for spatio-temporal action detection from a monocular video stream, often based on 3D convolutional neural networks, are close to reach levels of performances in mean Average Precision (mAP) compatibles with the automation of such task. Nevertheless, to meet their expectation of exhaustiveness in the context of data analytics, such methods must be applied in a regime of high recall – low precision, using low confidence score thresholds. This setting unavoidably leads to the detection of false positives that are the product of the well documented overconfidence behaviour of neural networks and, in this case, their limited access to contextual information and understanding of the game: their predictions are highly unstructured. Based on the assumption that professional soccer players’ behaviour, pose, positions and velocity are highly interrelated and locally driven by the player performing a ball-action, it is hypothesized that the addition of information regarding surrounding player’s appearance, positions and velocity in the prediction methods can improve their metrics. Several methods are compared to build a proper representation of the game surrounding a player, from handcrafted features of the local graph, based on domain knowledge, to the use of Graph Neural Networks trained in an end-to-end fashion with existing state-of-the-art 3D convolutional neural networks. It is shown that the inclusion of information regarding surrounding players helps reaching higher metrics.Keywords: fine-grained action recognition, human action recognition, convolutional neural networks, graph neural networks, spatio-temporal action recognition
Procedia PDF Downloads 291058 High-Fidelity Materials Screening with a Multi-Fidelity Graph Neural Network and Semi-Supervised Learning
Authors: Akeel A. Shah, Tong Zhang
Abstract:
Computational approaches to learning the properties of materials are commonplace, motivated by the need to screen or design materials for a given application, e.g., semiconductors and energy storage. Experimental approaches can be both time consuming and costly. Unfortunately, computational approaches such as ab-initio electronic structure calculations and classical or ab-initio molecular dynamics are themselves can be too slow for the rapid evaluation of materials, often involving thousands to hundreds of thousands of candidates. Machine learning assisted approaches have been developed to overcome the time limitations of purely physics-based approaches. These approaches, on the other hand, require large volumes of data for training (hundreds of thousands on many standard data sets such as QM7b). This means that they are limited by how quickly such a large data set of physics-based simulations can be established. At high fidelity, such as configuration interaction, composite methods such as G4, and coupled cluster theory, gathering such a large data set can become infeasible, which can compromise the accuracy of the predictions - many applications require high accuracy, for example band structures and energy levels in semiconductor materials and the energetics of charge transfer in energy storage materials. In order to circumvent this problem, multi-fidelity approaches can be adopted, for example the Δ-ML method, which learns a high-fidelity output from a low-fidelity result such as Hartree-Fock or density functional theory (DFT). The general strategy is to learn a map between the low and high fidelity outputs, so that the high-fidelity output is obtained a simple sum of the physics-based low-fidelity and correction, Although this requires a low-fidelity calculation, it typically requires far fewer high-fidelity results to learn the correction map, and furthermore, the low-fidelity result, such as Hartree-Fock or semi-empirical ZINDO, is typically quick to obtain, For high-fidelity outputs the result can be an order of magnitude or more in speed up. In this work, a new multi-fidelity approach is developed, based on a graph convolutional network (GCN) combined with semi-supervised learning. The GCN allows for the material or molecule to be represented as a graph, which is known to improve accuracy, for example SchNet and MEGNET. The graph incorporates information regarding the numbers of, types and properties of atoms; the types of bonds; and bond angles. They key to the accuracy in multi-fidelity methods, however, is the incorporation of low-fidelity output to learn the high-fidelity equivalent, in this case by learning their difference. Semi-supervised learning is employed to allow for different numbers of low and high-fidelity training points, by using an additional GCN-based low-fidelity map to predict high fidelity outputs. It is shown on 4 different data sets that a significant (at least one order of magnitude) increase in accuracy is obtained, using one to two orders of magnitude fewer low and high fidelity training points. One of the data sets is developed in this work, pertaining to 1000 simulations of quinone molecules (up to 24 atoms) at 5 different levels of fidelity, furnishing the energy, dipole moment and HOMO/LUMO.Keywords: .materials screening, computational materials, machine learning, multi-fidelity, graph convolutional network, semi-supervised learning
Procedia PDF Downloads 421057 Topological Language for Classifying Linear Chord Diagrams via Intersection Graphs
Authors: Michela Quadrini
Abstract:
Chord diagrams occur in mathematics, from the study of RNA to knot theory. They are widely used in theory of knots and links for studying the finite type invariants, whereas in molecular biology one important motivation to study chord diagrams is to deal with the problem of RNA structure prediction. An RNA molecule is a linear polymer, referred to as the backbone, that consists of four types of nucleotides. Each nucleotide is represented by a point, whereas each chord of the diagram stands for one interaction for Watson-Crick base pairs between two nonconsecutive nucleotides. A chord diagram is an oriented circle with a set of n pairs of distinct points, considered up to orientation preserving diffeomorphisms of the circle. A linear chord diagram (LCD) is a special kind of graph obtained cutting the oriented circle of a chord diagram. It consists of a line segment, called its backbone, to which are attached a number of chords with distinct endpoints. There is a natural fattening on any linear chord diagram; the backbone lies on the real axis, while all the chords are in the upper half-plane. Each linear chord diagram has a natural genus of its associated surface. To each chord diagram and linear chord diagram, it is possible to associate the intersection graph. It consists of a graph whose vertices correspond to the chords of the diagram, whereas the chord intersections are represented by a connection between the vertices. Such intersection graph carries a lot of information about the diagram. Our goal is to define an LCD equivalence class in terms of identity of intersection graphs, from which many chord diagram invariants depend. For studying these invariants, we introduce a new representation of Linear Chord Diagrams based on a set of appropriate topological operators that permits to model LCD in terms of the relations among chords. Such set is composed of: crossing, nesting, and concatenations. The crossing operator is able to generate the whole space of linear chord diagrams, and a multiple context free grammar able to uniquely generate each LDC starting from a linear chord diagram adding a chord for each production of the grammar is defined. In other words, it allows to associate a unique algebraic term to each linear chord diagram, while the remaining operators allow to rewrite the term throughout a set of appropriate rewriting rules. Such rules define an LCD equivalence class in terms of the identity of intersection graphs. Starting from a modelled RNA molecule and the linear chord, some authors proposed a topological classification and folding. Our LCD equivalence class could contribute to the RNA folding problem leading to the definition of an algorithm that calculates the free energy of the molecule more accurately respect to the existing ones. Such LCD equivalence class could be useful to obtain a more accurate estimate of link between the crossing number and the topological genus and to study the relation among other invariants.Keywords: chord diagrams, linear chord diagram, equivalence class, topological language
Procedia PDF Downloads 2031056 Comparison of Unit Hydrograph Models to Simulate Flood Events at the Field Scale
Authors: Imene Skhakhfa, Lahbaci Ouerdachi
Abstract:
To ensure the overall coherence of simulated results, it is necessary to develop a robust validation process. In many applications, it is no longer content to calibrate and validate the model only in relation to the hydro graph measured at the outlet, but we try to better simulate the functioning of the watershed in space. Therefore the timing also performs compared to other variables such as water level measurements in intermediate stations or groundwater levels. As part of this work, we limit ourselves to modeling flood of short duration for which the process of evapotranspiration is negligible. The main parameters to identify the models are related to the method of unit hydro graph (HU). Three different models were tested: SNYDER, CLARK and SCS. These models differ in their mathematical structure and parameters to be calibrated while hydrological data are the same, the initial water content and precipitation. The models are compared on the basis of their performance in terms six objective criteria, three global criteria and three criteria representing volume, peak flow, and the mean square error. The first type of criteria gives more weight to strong events whereas the second considers all events to be of equal weight. The results show that the calibrated parameter values are dependent and also highlight the problems associated with the simulation of low flow events and intermittent precipitation.Keywords: model calibration, intensity, runoff, hydrograph
Procedia PDF Downloads 4861055 An Insite to the Probabilistic Assessment of Reserves in Conventional Reservoirs
Authors: Sai Sudarshan, Harsh Vyas, Riddhiman Sherlekar
Abstract:
The oil and gas industry has been unwilling to adopt stochastic definition of reserves. Nevertheless, Monte Carlo simulation methods have gained acceptance by engineers, geoscientists and other professionals who want to evaluate prospects or otherwise analyze problems that involve uncertainty. One of the common applications of Monte Carlo simulation is the estimation of recoverable hydrocarbon from a reservoir.Monte Carlo Simulation makes use of random samples of parameters or inputs to explore the behavior of a complex system or process. It finds application whenever one needs to make an estimate, forecast or decision where there is significant uncertainty. First, the project focuses on performing Monte-Carlo Simulation on a given data set using U. S Department of Energy’s MonteCarlo Software, which is a freeware e&p tool. Further, an algorithm for simulation has been developed for MATLAB and program performs simulation by prompting user for input distributions and parameters associated with each distribution (i.e. mean, st.dev, min., max., most likely, etc.). It also prompts user for desired probability for which reserves are to be calculated. The algorithm so developed and tested in MATLAB further finds implementation in Python where existing libraries on statistics and graph plotting have been imported to generate better outcome. With PyQt designer, codes for a simple graphical user interface have also been written. The graph so plotted is then validated with already available results from U.S DOE MonteCarlo Software.Keywords: simulation, probability, confidence interval, sensitivity analysis
Procedia PDF Downloads 3831054 Content-Based Mammograms Retrieval Based on Breast Density Criteria Using Bidimensional Empirical Mode Decomposition
Authors: Sourour Khouaja, Hejer Jlassi, Nadia Feddaoui, Kamel Hamrouni
Abstract:
Most medical images, and especially mammographies, are now stored in large databases. Retrieving a desired image is considered of great importance in order to find previous similar cases diagnosis. Our method is implemented to assist radiologists in retrieving mammographic images containing breast with similar density aspect as seen on the mammogram. This is becoming a challenge seeing the importance of density criteria in cancer provision and its effect on segmentation issues. We used the BEMD (Bidimensional Empirical Mode Decomposition) to characterize the content of images and Euclidean distance measure similarity between images. Through the experiments on the MIAS mammography image database, we confirm that the results are promising. The performance was evaluated using precision and recall curves comparing query and retrieved images. Computing recall-precision proved the effectiveness of applying the CBIR in the large mammographic image databases. We found a precision of 91.2% for mammography with a recall of 86.8%.Keywords: BEMD, breast density, contend-based, image retrieval, mammography
Procedia PDF Downloads 2341053 Domain specific Ontology-Based Knowledge Extraction Using R-GNN and Large Language Models
Authors: Andrey Khalov
Abstract:
The rapid proliferation of unstructured data in IT infrastructure management demands innovative approaches for extracting actionable knowledge. This paper presents a framework for ontology-based knowledge extraction that combines relational graph neural networks (R-GNN) with large language models (LLMs). The proposed method leverages the DOLCE framework as the foundational ontology, extending it with concepts from ITSMO for domain-specific applications in IT service management and outsourcing. A key component of this research is the use of transformer-based models, such as DeBERTa-v3-large, for automatic entity and relationship extraction from unstructured texts. Furthermore, the paper explores how transfer learning techniques can be applied to fine-tune large language models (LLaMA) for using to generate synthetic datasets to improve precision in BERT-based entity recognition and ontology alignment. The resulting IT Ontology (ITO) serves as a comprehensive knowledge base that integrates domain-specific insights from ITIL processes, enabling more efficient decision-making. Experimental results demonstrate significant improvements in knowledge extraction and relationship mapping, offering a cutting-edge solution for enhancing cognitive computing in IT service environments.Keywords: ontology mapping, R-GNN, knowledge extraction, large language models, NER, knowlege graph
Procedia PDF Downloads 191052 Examining Social Connectivity through Email Network Analysis: Study of Librarians' Emailing Groups in Pakistan
Authors: Muhammad Arif Khan, Haroon Idrees, Imran Aziz, Sidra Mushtaq
Abstract:
Social platforms like online discussion and mailing groups are well aligned with academic as well as professional learning spaces. Professional communities are increasingly moving to online forums for sharing and capturing the intellectual abilities. This study investigated dynamics of social connectivity of yahoo mailing groups of Pakistani Library and Information Science (LIS) professionals using Graph Theory technique. Design/Methodology: Social Network Analysis is the increasingly concerned domain for scientists in identifying whether people grow together through online social interaction or, whether they just reflect connectivity. We have conducted a longitudinal study using Network Graph Theory technique to analyze the large data-set of email communication. The data was collected from three yahoo mailing groups using network analysis software over a period of six months i.e. January to June 2016. Findings of the network analysis were reviewed through focus group discussion with LIS experts and selected respondents of the study. Data were analyzed in Microsoft Excel and network diagrams were visualized using NodeXL and ORA-Net Scene package. Findings: Findings demonstrate that professionals and students exhibit intellectual growth the more they get tied within a network by interacting and participating in communication through online forums. The study reports on dynamics of the large network by visualizing the email correspondence among group members in a network consisting vertices (members) and edges (randomized correspondence). The model pair wise relationship between group members was illustrated to show characteristics, reasons, and strength of ties. Connectivity of nodes illustrated the frequency of communication among group members through examining node coupling, diffusion of networks, and node clustering has been demonstrated in-depth. Network analysis was found to be a useful technique in investigating the dynamics of the large network.Keywords: emailing networks, network graph theory, online social platforms, yahoo mailing groups
Procedia PDF Downloads 2411051 Malware Beaconing Detection by Mining Large-scale DNS Logs for Targeted Attack Identification
Authors: Andrii Shalaginov, Katrin Franke, Xiongwei Huang
Abstract:
One of the leading problems in Cyber Security today is the emergence of targeted attacks conducted by adversaries with access to sophisticated tools. These attacks usually steal senior level employee system privileges, in order to gain unauthorized access to confidential knowledge and valuable intellectual property. Malware used for initial compromise of the systems are sophisticated and may target zero-day vulnerabilities. In this work we utilize common behaviour of malware called ”beacon”, which implies that infected hosts communicate to Command and Control servers at regular intervals that have relatively small time variations. By analysing such beacon activity through passive network monitoring, it is possible to detect potential malware infections. So, we focus on time gaps as indicators of possible C2 activity in targeted enterprise networks. We represent DNS log files as a graph, whose vertices are destination domains and edges are timestamps. Then by using four periodicity detection algorithms for each pair of internal-external communications, we check timestamp sequences to identify the beacon activities. Finally, based on the graph structure, we infer the existence of other infected hosts and malicious domains enrolled in the attack activities.Keywords: malware detection, network security, targeted attack, computational intelligence
Procedia PDF Downloads 2661050 The Malfatti’s Problem in Reuleaux Triangle
Authors: Ching-Shoei Chiang
Abstract:
The Malfatti’s Problem is to ask for fitting 3 circles into a right triangle such that they are tangent to each other, and each circle is also tangent to a pair of the triangle’s side. This problem has been extended to any triangle (called general Malfatti’s Problem). Furthermore, the problem has been extended to have 1+2+…+n circles, we call it extended general Malfatti’s problem, these circles whose tangency graph, using the center of circles as vertices and the edge connect two circles center if these two circles tangent to each other, has the structure as Pascal’s triangle, and the exterior circles of these circles tangent to three sides of the triangle. In the extended general Malfatti’s problem, there are closed-form solutions for n=1, 2, and the problem becomes complex when n is greater than 2. In solving extended general Malfatti’s problem (n>2), we initially give values to the radii of all circles. From the tangency graph and current radii, we can compute angle value between two vectors. These vectors are from the center of the circle to the tangency points with surrounding elements, and these surrounding elements can be the boundary of the triangle or other circles. For each circle C, there are vectors from its center c to its tangency point with its neighbors (count clockwise) pi, i=0, 1,2,..,n. We add all angles between cpi to cp(i+1) mod (n+1), i=0,1,..,n, call it sumangle(C) for circle C. Using sumangle(C), we can reduce/enlarge the radii for all circles in next iteration, until sumangle(C) is equal to 2πfor all circles. With a similar idea, this paper proposed an algorithm to find the radii of circles whose tangency has the structure of Pascal’s triangle, and the exterior circles of these circles are tangent to the unit Realeaux Triangle.Keywords: Malfatti’s problem, geometric constraint solver, computer-aided geometric design, circle packing, data visualization
Procedia PDF Downloads 1331049 Brain Tumor Segmentation Based on Minimum Spanning Tree
Authors: Simeon Mayala, Ida Herdlevær, Jonas Bull Haugsøen, Shamundeeswari Anandan, Sonia Gavasso, Morten Brun
Abstract:
In this paper, we propose a minimum spanning tree-based method for segmenting brain tumors. The proposed method performs interactive segmentation based on the minimum spanning tree without tuning parameters. The steps involve preprocessing, making a graph, constructing a minimum spanning tree, and a newly implemented way of interactively segmenting the region of interest. In the preprocessing step, a Gaussian filter is applied to 2D images to remove the noise. Then, the pixel neighbor graph is weighted by intensity differences and the corresponding minimum spanning tree is constructed. The image is loaded in an interactive window for segmenting the tumor. The region of interest and the background are selected by clicking to split the minimum spanning tree into two trees. One of these trees represents the region of interest and the other represents the background. Finally, the segmentation given by the two trees is visualized. The proposed method was tested by segmenting two different 2D brain T1-weighted magnetic resonance image data sets. The comparison between our results and the standard gold segmentation confirmed the validity of the minimum spanning tree approach. The proposed method is simple to implement and the results indicate that it is accurate and efficient.Keywords: brain tumor, brain tumor segmentation, minimum spanning tree, segmentation, image processing
Procedia PDF Downloads 1221048 Analyzing Medical Workflows Using Market Basket Analysis
Authors: Mohit Kumar, Mayur Betharia
Abstract:
Healthcare domain, with the emergence of Electronic Medical Record (EMR), collects a lot of data which have been attracting Data Mining expert’s interest. In the past, doctors have relied on their intuition while making critical clinical decisions. This paper presents the means to analyze the Medical workflows to get business insights out of huge dumped medical databases. Market Basket Analysis (MBA) which is a special data mining technique, has been widely used in marketing and e-commerce field to discover the association between products bought together by customers. It helps businesses in increasing their sales by analyzing the purchasing behavior of customers and pitching the right customer with the right product. This paper is an attempt to demonstrate Market Basket Analysis applications in healthcare. In particular, it discusses the Market Basket Analysis Algorithm ‘Apriori’ applications within healthcare in major areas such as analyzing the workflow of diagnostic procedures, Up-selling and Cross-selling of Healthcare Systems, designing healthcare systems more user-friendly. In the paper, we have demonstrated the MBA applications using Angiography Systems, but can be extrapolated to other modalities as well.Keywords: data mining, market basket analysis, healthcare applications, knowledge discovery in healthcare databases, customer relationship management, healthcare systems
Procedia PDF Downloads 1741047 The Role of Cyfra 21-1 in Diagnosing Non Small Cell Lung Cancer (NSCLC)
Authors: H. J. T. Kevin Mozes, Dyah Purnamasari
Abstract:
Background: Lung cancer accounted for the fourth most common cancer in Indonesia. 85% of all lung cancer cases are the Non-Small Cell Lung Cancer (NSCLC). The indistinct signs and symptoms of NSCLC sometimes lead to misdiagnosis. The gold standard assessment for the diagnosis of NSCLC is the histopathological biopsy, which is invasive. Cyfra 21-1 is a tumor marker, which can be found in the intermediate protein structure in the epitel. The accuracy of Cyfra 21-1 in diagnosing NSCLC is not yet known, so this report is made to seek the answer for the question above. Methods: Literature searching is done using online databases. Proquest and Pubmed are online databases being used in this report. Then, literature selection is done by excluding and including based on inclusion criterias and exclusion criterias. The selected literature is then being appraised using the criteria of validity, importance, and validity. Results: From six journals appraised, five of them are valid. Sensitivity value acquired from all five literature is ranging from 50-84.5 %, meanwhile the specificity is 87.8 %-94.4 %. Likelihood the ratio of all appraised literature is ranging from 5.09 -10.54, which categorized to Intermediate High. Conclusion: Serum Cyfra 21-1 is a sensitive and very specific tumor marker for diagnosis of non-small cell lung cancer (NSCLC).Keywords: cyfra 21-1, diagnosis, nonsmall cell lung cancer, NSCLC, tumor marker
Procedia PDF Downloads 2321046 A New Bound on the Average Information Ratio of Perfect Secret-Sharing Schemes for Access Structures Based on Bipartite Graphs of Larger Girth
Authors: Hui-Chuan Lu
Abstract:
In a perfect secret-sharing scheme, a dealer distributes a secret among a set of participants in such a way that only qualified subsets of participants can recover the secret and the joint share of the participants in any unqualified subset is statistically independent of the secret. The access structure of the scheme refers to the collection of all qualified subsets. In a graph-based access structures, each vertex of a graph G represents a participant and each edge of G represents a minimal qualified subset. The average information ratio of a perfect secret-sharing scheme realizing a given access structure is the ratio of the average length of the shares given to the participants to the length of the secret. The infimum of the average information ratio of all possible perfect secret-sharing schemes realizing an access structure is called the optimal average information ratio of that access structure. We study the optimal average information ratio of the access structures based on bipartite graphs. Based on some previous results, we give a bound on the optimal average information ratio for all bipartite graphs of girth at least six. This bound is the best possible for some classes of bipartite graphs using our approach.Keywords: secret-sharing scheme, average information ratio, star covering, deduction, core cluster
Procedia PDF Downloads 362