Search results for: visibility graph analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27694

27454 A Polynomial Approach for a Graphical-based Integrated Production and Transport Scheduling with Capacity Restrictions

Authors: M. Ndeley

Abstract:

The performance of global manufacturing supply chains depends on the interaction of production and transport processes. Currently, the scheduling of these processes is done separately, without considering mutual requirements, which leads to suboptimal solutions. An integrated scheduling of both processes enables the improvement of supply chain performance. The integrated production and transport scheduling problem (PTSP) is NP-hard, so heuristic methods are necessary to efficiently solve large problem instances, as in the case of global manufacturing supply chains. This paper presents a heuristic scheduling approach which handles the integration of flexible production processes with intermodal transport, incorporating flexible land transport. The method is based on a graph that allows a reformulation of the PTSP as a shortest path problem for each job, which can be solved in polynomial time. The proposed method is applied to a supply chain scenario with a manufacturing facility in South Africa and shipments of finished product to customers within the country. The obtained results show that the approach is suitable for the scheduling of large-scale problems and can be flexibly adapted to different scenarios.
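
To illustrate the shortest-path reformulation described above, the following sketch models one job's production and transport options as a weighted directed graph and solves it with Dijkstra's algorithm. The node names, costs and network are hypothetical stand-ins, not the paper's actual model; capacity restrictions would additionally require updating edge costs between jobs.

```python
# Hypothetical sketch: one job's integrated schedule as a shortest path.
import networkx as nx

G = nx.DiGraph()
# Nodes encode (location, processing stage); edge weights combine
# production and transport costs for a single job (illustrative values).
G.add_edge(("plant", "raw"), ("plant", "finished"), weight=4.0)          # production
G.add_edge(("plant", "finished"), ("rail_hub", "finished"), weight=1.8)  # rail leg
G.add_edge(("plant", "finished"), ("port", "finished"), weight=2.5)      # road leg
G.add_edge(("rail_hub", "finished"), ("customer", "finished"), weight=2.0)
G.add_edge(("port", "finished"), ("customer", "finished"), weight=3.1)

# Dijkstra runs in polynomial time, matching the complexity claim above.
path = nx.dijkstra_path(G, ("plant", "raw"), ("customer", "finished"))
cost = nx.dijkstra_path_length(G, ("plant", "raw"), ("customer", "finished"))
print(path, cost)
```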

Keywords: production and transport scheduling problem, graph based scheduling, integrated scheduling

Procedia PDF Downloads 464
27453 Knowledge Graph Development to Connect Earth Metadata and Standard English Queries

Authors: Gabriel Montague, Max Vilgalys, Catherine H. Crawford, Jorge Ortiz, Dava Newman

Abstract:

There has never been so much publicly accessible atmospheric and environmental data. The possibilities of these data are exciting, but the sheer volume of available datasets represents a new challenge for researchers. The task of identifying and working with a new dataset has become more difficult with the amount and variety of available data. Datasets are often documented in ways that differ substantially from the common English used to describe the same topics. This presents a barrier not only for new scientists, but for researchers looking to find comparisons across multiple datasets or specialists from other disciplines hoping to collaborate. This paper proposes a method for addressing this obstacle: creating a knowledge graph to bridge the gap between everyday English language and the technical language surrounding these datasets. Knowledge graph generation is already a well-established field, although there are some unique challenges posed by working with Earth data. One is the sheer size of the databases: it would be infeasible to replicate or analyze all the data stored by an organization like the National Aeronautics and Space Administration (NASA) or the European Space Agency. Instead, this approach identifies topics from metadata available for datasets in NASA's Earthdata database, which can then be used to directly request and access the raw data from NASA. By starting with a single metadata standard, this paper establishes an approach that can be generalized to different databases, but leaves the challenge of metadata harmonization for future work. Topics generated from the metadata are then linked to topics from a collection of English queries through a variety of standard and custom natural language processing (NLP) methods. The results from this method are then compared to a baseline of Elasticsearch applied to the metadata. This comparison shows the benefits of the proposed knowledge graph system over existing methods, particularly in interpreting natural language queries and interpreting topics in metadata. For the research community, this work introduces an application of NLP to the ecological and environmental sciences, expanding the possibilities of how machine learning can be applied in this discipline. But perhaps more importantly, it establishes the foundation for a platform that can enable common English to access knowledge that previously required considerable effort and experience. By making this public data accessible to the broader public, this work has the potential to transform environmental understanding, engagement, and action.
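
The topic-linking step can be pictured with a minimal sketch: below, a plain-English query is matched to metadata-derived topic strings by TF-IDF cosine similarity. The topic strings and query are invented for illustration; the authors' system combines several standard and custom NLP methods rather than this single technique.

```python
# Illustrative sketch (not the authors' system): match a plain-English
# query to dataset metadata topics with TF-IDF cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

metadata_topics = [
    "sea surface temperature anomaly MODIS",
    "aerosol optical depth atmospheric composition",
    "soil moisture passive microwave radiometer",
]
query = ["how warm is the ocean surface"]

vec = TfidfVectorizer()
topic_vecs = vec.fit_transform(metadata_topics)  # vocabulary from metadata
query_vec = vec.transform(query)                 # query in the same space

scores = cosine_similarity(query_vec, topic_vecs)[0]
best = scores.argmax()
print(metadata_topics[best], scores[best])
```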

Keywords: earth metadata, knowledge graphs, natural language processing, question-answer systems

Procedia PDF Downloads 135
27452 Real-Time Network Anomaly Detection Systems Based on Machine-Learning Algorithms

Authors: Zahra Ramezanpanah, Joachim Carvallo, Aurelien Rodriguez

Abstract:

This paper aims to detect anomalies in streaming data using machine learning algorithms. To this end, we designed two separate pipelines and evaluated the effectiveness of each separately. The first pipeline, based on supervised machine learning methods, consists of two phases. In the first phase, we trained several supervised models using the UNSW-NB15 dataset, measured the efficiency of each using different performance metrics, and selected the best model for the second phase. At the beginning of the second phase, we used Argus Server to sniff a local area network. Several types of attacks were simulated, and the sniffed data were sent to a running algorithm at short intervals. This algorithm displays the results for each received packet in real time using the trained model. The second pipeline presented in this paper is based on unsupervised algorithms, in which a Temporal Graph Network (TGN) is used to monitor a local network. The TGN is trained to predict the probability of future states of the network based on its past behavior. Our contribution in this section is introducing an indicator to identify anomalies from these predicted probabilities.
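
A minimal sketch of the first phase might look as follows, assuming the UNSW-NB15 training partition is available as a local CSV. The file name and feature handling here are assumptions, and the random forest is just one of several supervised candidates one could try.

```python
# Minimal sketch of the supervised phase on UNSW-NB15-style data.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("UNSW_NB15_training-set.csv")  # assumed local path
X = df.select_dtypes("number").drop(columns=["label"])  # numeric features
y = df["label"]  # 0 = normal traffic, 1 = attack

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```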

Keywords: temporal graph network, anomaly detection, cyber security, IDS

Procedia PDF Downloads 90
27451 Row Detection and Graph-Based Localization in Tree Nurseries Using a 3D LiDAR

Authors: Ionut Vintu, Stefan Laible, Ruth Schulz

Abstract:

Agricultural robotics has been developing steadily over recent years, with the goal of reducing or even eliminating the pesticides used in crops and of increasing productivity by taking over human labor. The majority of crops are arranged in rows. The first step towards autonomous robots, capable of driving in fields and performing crop-handling tasks, is for robots to robustly detect the rows of plants. Recent work on autonomous driving between plant rows offers large robotic platforms equipped with various expensive sensors as a solution to this problem. These platforms need to be driven over the rows of plants. This approach lacks flexibility and scalability when it comes to the height of plants or the distance between rows. This paper instead proposes an algorithm that makes use of cheaper sensors and offers greater flexibility. The main application is in tree nurseries. Here, plant height can range from a few centimeters to a few meters. Moreover, trees are often removed, leading to gaps within the plant rows. The core idea is to combine row detection algorithms with graph-based localization methods as they are used in SLAM. Nodes in the graph represent the estimated pose of the robot, and the edges embed constraints between these poses or between the robot and certain landmarks. This setup aims to improve individual plant detection and to deal with exception handling, such as row gaps, which are falsely detected as an end of rows. Four methods were developed for detecting row structures in the fields, all using a point cloud acquired with a 3D LiDAR as input. Comparing the field coverage and the number of damaged plants, the method that uses a local map around the robot proved to perform best, with 68% covered rows and 25% damaged plants. This method is further used and combined with a graph-based localization algorithm, which uses the local map features to estimate the robot's position inside the greater field. Testing the upgraded algorithm in a variety of simulated fields shows that the additional information obtained from localization provides a boost in performance over methods that rely purely on perception to navigate. The final algorithm achieved a row coverage of 80% with 27% damaged plants. Future work will focus on achieving a perfect score of 100% covered rows and 0% damaged plants. The main challenges that the algorithm needs to overcome are fields where the plants are too small to be detected and fields where it is hard to distinguish between individual plants when they overlap. The method was also tested on a real robot in a small field with artificial plants. The tests were performed using a small robot platform equipped with wheel encoders, an IMU and an FX10 3D LiDAR. Over ten runs, the system achieved 100% coverage and 0% damaged plants. The framework built within the scope of this work can be further used to integrate data from additional sensors, with the goal of achieving even better results.
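
The graph-based localization component can be sketched in miniature: poses are nodes, odometry measurements are edge constraints, and a least-squares solver refines the poses. This toy example is not the authors' implementation (it uses 2D positions only and invented measurements); row-feature landmark constraints would enter as additional residuals.

```python
# Toy pose-graph sketch: poses are nodes, measurements are edge constraints.
import numpy as np
from scipy.optimize import least_squares

odom = [(0, 1, np.array([1.0, 0.1])),   # (from, to, measured displacement)
        (1, 2, np.array([1.0, -0.1])),
        (0, 2, np.array([2.1, 0.0]))]   # loop-closure-style constraint

def residuals(flat):
    poses = flat.reshape(-1, 2)        # 2D positions only, headings omitted
    res = [poses[j] - poses[i] - z for i, j, z in odom]
    res.append(poses[0])               # anchor the first pose at the origin
    return np.concatenate(res)

x0 = np.zeros(3 * 2)                   # initial guess for three poses
sol = least_squares(residuals, x0)
print(sol.x.reshape(-1, 2))            # refined, mutually consistent poses
```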

Keywords: 3D LiDAR, agricultural robots, graph-based localization, row detection

Procedia PDF Downloads 127
27450 Language Inequalities in the Algerian Public Space: A Semiotic Landscape Analysis

Authors: Sarah Smail

Abstract:

Algeria has been subject to countless conquests and invasions, which have resulted in a diverse linguistic repertoire. The sociolinguistic situation of the country makes linguistic landscape analysis pertinent. This, in fact, has led to the growth of diverse linguistic landscape studies that mainly focus on identifying the sociolinguistic situation of the country through the analysis of shop names. The present research adds to the existing literature by offering another perspective on the analysis of signs, combining the physical and the digital semiotic landscape. The powerful oil, gas and agri-food industries in Algeria make it interesting to focus on the commodification of natural resources, for the sake of identifying the language and semiotic resources deployed in the Algerian public scene, in addition to identifying the visibility of linguistic inequalities and minorities in the business domain. The study discusses the semiotic landscape of three trade cities, Bejaia, Setif and Hassi-Messaoud, drawing in addition on interviews conducted with business owners and graphic designers and questionnaires administered to business employees. The study relies on Gorter's multilingual inequalities in public space (MIPS) model (2021) and Irvine and Gal's language ideology and linguistic differentiation (2000). The preliminary results demonstrate the sociolinguistic injustice existing in the business domain, e.g., the exclusion of the official languages, the dominance of foreign languages, and the excessive use of the Roman script.

Keywords: semiotic landscaping, digital scapes, language commodification, linguistic inequalities, business signage

Procedia PDF Downloads 92
27449 Optimizing Resource Management in Cloud Computing through Blockchain-Enabled Cost Transparency

Authors: Raghava Satya SaiKrishna Dittakavi

Abstract:

Cloud computing has revolutionized how businesses and individuals store, access, and process data, increasing efficiency and reducing infrastructure costs. However, the lack of transparency in cloud service billing often raises concerns about overcharging and hidden fees, hindering the realization of the full potential of cloud computing. This research paper explores how blockchain technology can be leveraged to introduce cost transparency and accountability in cloud computing services. We present a comprehensive analysis of blockchain-enabled solutions that enhance cost visibility, facilitate auditability, and promote trust in cloud service providers. Through this study, we aim to provide insights into the potential benefits and challenges of implementing blockchain in the cloud computing domain, leading to improved cost management and customer satisfaction.

Keywords: blockchain, cloud computing, cost transparency, blockchain technology

Procedia PDF Downloads 73
27448 Citation Analysis of New Zealand Court Decisions

Authors: Tobias Milz, L. Macpherson, Varvara Vetrova

Abstract:

The law is a fundamental pillar of human societies as it shapes, controls and governs how humans conduct business, behave and interact with each other. Recent advances in computer-assisted technologies such as NLP, data science and AI are creating opportunities to support the practice, research and study of this pervasive domain. It is therefore not surprising that there has been an increase in investments into supporting technologies for the legal industry (also known as "legal tech" or "law tech") over the last decade. A sub-discipline of particular appeal is concerned with assisted legal research. Supporting law researchers and practitioners in retrieving information from the vast amount of ever-growing legal documentation is of natural interest to the legal research community. One tool that has been in use for this purpose since the early nineteenth century is legal citation indexing. Among other use cases, such indexes provided an effective means to discover new precedent cases. Nowadays, computer-assisted network analysis tools allow for new and more efficient ways to reveal the "hidden" information that is conveyed through citation behavior. Unfortunately, access to openly available legal data is still lacking in New Zealand, and access to such networks is only commercially available via providers such as LexisNexis. Consequently, there is a need to create, analyze and provide a legal citation network with sufficient data to support legal research tasks. This paper describes the development and analysis of a legal citation network for New Zealand containing over 300,000 decisions from 125 different courts of all areas of law and jurisdiction. Using Python, the authors assembled web crawlers, scrapers and an OCR pipeline to collect and convert court decisions from openly available sources such as NZLII into uniform and machine-readable text. This facilitated the use of regular expressions to identify references to other court decisions from within the decision text. The data was then imported into a graph-based database (Neo4j) with the courts and their respective cases represented as nodes and the extracted citations as links. Furthermore, additional links between courts of connected cases were added to indicate an indirect citation between the courts. Neo4j, as a graph-based database, allows efficient querying and the use of network algorithms such as PageRank to reveal the most influential/most cited courts and court decisions over time. This paper shows that the in-degree distribution of the New Zealand legal citation network resembles a power-law distribution, which indicates a possible scale-free behavior of the network. This is in line with findings for the respective citation networks of the U.S. Supreme Court, Austria and Germany. The authors of this paper provide the database as an openly available data source to support further legal research. The decision texts can be exported from the database to be used for NLP-related legal research, while the network can be used for in-depth analysis. For example, users of the database can specify the network algorithms and metrics to only include specific courts to filter the results to the area of law of interest.
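
The extraction-and-ranking step can be illustrated with a small stand-in: a regular expression pulls neutral citations (e.g. "[2014] NZSC 38") out of decision text, a directed graph is built, and PageRank scores the decisions. The snippets are invented, and networkx substitutes here for the Neo4j stack actually used.

```python
# Hedged sketch: regex-based citation extraction plus PageRank ranking.
import re
import networkx as nx

CITE = re.compile(r"\[\d{4}\] [A-Z]+ \d+")  # neutral citation pattern

decisions = {  # decision id -> decision text (invented snippets)
    "[2014] NZSC 38": "... as held in [2012] NZCA 11 and [2010] NZHC 5 ...",
    "[2012] NZCA 11": "... following [2010] NZHC 5 ...",
    "[2010] NZHC 5": "... no earlier citations ...",
}

G = nx.DiGraph()
for case_id, text in decisions.items():
    for cited in CITE.findall(text):
        G.add_edge(case_id, cited)  # citing decision -> cited decision

print(nx.pagerank(G))  # most-cited decisions receive the highest scores
```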

Keywords: case citation network, citation analysis, network analysis, Neo4j

Procedia PDF Downloads 94
27447 A Study on the Computation of Gourava Indices for Poly-L Lysine Dendrimer and Its Biomedical Applications

Authors: M. Helen

Abstract:

Chemical graphs serve as convenient models for any real or abstract chemical system. Dendrimers are novel three-dimensional, hyperbranched, globular nanopolymeric architectures. Drug delivery scientists are especially enthusiastic about the possible utility of dendrimers as drug delivery tools. Dendrimers like poly-L-lysine (PLL), poly-propylene imine (PPI) and poly-amidoamine (PAMAM) are used as gene carriers in drug delivery systems because of their chemical characteristics. These characteristics of chemical compounds are analysed using topological indices (invariants under graph isomorphism) such as the Wiener index, the Zagreb indices, etc. Motivated by the application of Zagreb indices in finding the total π-electron energy, V. R. Kulli derived the Gourava indices, an improved version of the Zagreb indices. In this paper, we study the structure of the PLL dendrimer, which has the following applications: reduction in toxicity, colon delivery, and topical delivery. We also determine the first and second Gourava indices, the first and second hyper-Gourava indices, and the product and sum connectivity Gourava indices for the PLL dendrimer. Gourava indices have found applications in Quantitative Structure-Property Relationship (QSPR) and Quantitative Structure-Activity Relationship (QSAR) studies.
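
For reference, the first and second Gourava indices are commonly defined as follows (following Kulli; the hyper-Gourava indices square these edge contributions, and the sum and product connectivity Gourava indices take reciprocal square roots of them):

```latex
% d(u) denotes the degree of vertex u, E(G) the edge set of the graph G.
\begin{align}
GO_1(G) &= \sum_{uv \in E(G)} \bigl[\, d(u) + d(v) + d(u)\,d(v) \,\bigr],\\
GO_2(G) &= \sum_{uv \in E(G)} \bigl[\, d(u) + d(v) \,\bigr]\, d(u)\,d(v).
\end{align}
```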

Keywords: connectivity Gourava indices, dendrimer, Gourava indices, hyper Gourava indices

Procedia PDF Downloads 130
27446 SIPINA Induction Graph Method for Seismic Risk Prediction

Authors: B. Selma

Abstract:

The aim of this study is to test the feasibility of the SIPINA method to predict the harmfulness parameters controlling the seismic response. The approach developed takes into consideration both the focal depth and the peak ground acceleration. The parameter to determine is displacement. The data used for the learning of this method and for the nonlinear seismic analysis are described and applied to a class of damage models for some typical structures of the existing urban infrastructure of Jassy, Romania. The results obtained indicate an influence of the focal depth and the peak ground acceleration on the displacement.
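
SIPINA induces induction graphs, a generalization of decision trees; as a rough stand-in, the sketch below trains scikit-learn's CART tree (not SIPINA itself) on synthetic focal depth and peak ground acceleration data to show the kind of rule induction involved:

```python
# Illustrative stand-in for SIPINA-style rule induction on synthetic data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(5, 150, 200),      # focal depth (km)
                     rng.uniform(0.05, 0.6, 200)])  # peak ground accel. (g)
# Synthetic rule: large displacement when PGA is high and the source shallow.
y = ((X[:, 1] > 0.3) & (X[:, 0] < 60)).astype(int)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(tree, feature_names=["focal_depth", "pga"]))
```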

Keywords: SIPINA algorithm, seism, focal depth, peak ground acceleration, displacement

Procedia PDF Downloads 299
27445 Characterization of the GntR Family Transcriptional Regulator Rv0792c: A Potential Drug Target for Mycobacterium tuberculosis

Authors: Thanusha D. Abeywickrama, Inoka C. Perera, Genji Kurisu

Abstract:

Tuberculosis, considered the ninth leading cause of death worldwide, is caused by a single infectious agent, M. tuberculosis, and the drug-resistant nature of this bacterium is a continuing threat to the world. As TB-preventive treatment expands, this study was designed to analyze the regulatory mechanism of the GntR transcriptional regulator gene Rv0792c, which lies between several genes coding for hypothetical proteins, a monooxygenase and an oxidoreductase. The gene encoding Rv0792c was cloned into pET28a, and the expressed protein was purified to near homogeneity by nickel affinity chromatography. It was previously reported that the protein binds within the intergenic region (BS region) between the Rv0792c gene and the monooxygenase (Rv0793). This resulted in the binding of three protein molecules within the BS region, suggesting tight control of the monooxygenase as well as of its own gene. Since the monooxygenase plays a key role in metabolism, this gene may have a global regulatory role. The natural ligand for this regulator is still under investigation. In relation to the Rv0792 protein structure, a Circular Dichroism (CD) spectrum was recorded to determine its secondary structure elements. The spectrum revealed 17.4% helix, 21.8% antiparallel, 5.1% parallel, 12.3% turn and 43.5% other at room temperature. Differential Scanning Calorimetry (DSC) was conducted to assess the thermal stability of Rv0792; the melting temperature of the protein is 57.2 ± 0.6 °C. The best fit of the heat capacity (Cp) versus temperature curve was obtained for a non-two-state model, which indicates that the folding of the Rv0792 protein occurs through stable intermediates. The peak area (∆HCal) and peak shape (∆HVant) were calculated from the curve, and ∆HCal/∆HVant was close to 0.5, suggesting the dimeric nature of the protein.

Keywords: CD spectrum, DSC analysis, GntR transcriptional regulator, protein structure

Procedia PDF Downloads 212
27444 Optimizing the Location of Parking Areas Adapted for Dangerous Goods in the European Road Transport Network

Authors: María Dolores Caro, Eugenio M. Fedriani, Ángel F. Tenorio

Abstract:

The transportation of dangerous goods by lorries throughout Europe must be done using the roads that make up the European Road Transport Network. In this network, there are several parking areas where lorry drivers can park to rest according to the regulations. According to the "European Agreement concerning the International Carriage of Dangerous Goods by Road", parking areas where lorries transporting dangerous goods can park to rest must follow several security stipulations to keep the rest of the road users safe. In this respect, these lorries must be parked in adapted areas with strict and permanent surveillance measures. Moreover, drivers must satisfy several restrictions on resting and driving time. Under these facts, one might expect that there exist enough parking areas for the transport of this type of goods to comply with the regulations prescribed by the European Union and its member countries. However, the already-existing parking areas are not sufficient to cover all the stops required by drivers transporting dangerous goods. Our main goal is, starting from the already-existing parking areas and the loading-and-unloading locations, to provide an optimal answer to the following question: how many additional parking areas must be built, and where must they be located, to assure that lorry drivers can transport dangerous goods following all the stipulations about security and safety for their stops? The word "optimal" refers to the fact that we give a global solution for the location of parking areas throughout the whole European Road Transport Network, keeping the number of additional areas as low as possible. To do so, we have modeled the problem using graph theory, since we are working with a road network. As nodes, we have considered the locations of each already-existing parking area, each loading-and-unloading area and each road bifurcation. Each road connecting two nodes is considered as an edge in the graph, whose weight corresponds to the distance between both nodes. By applying a new efficient algorithm, we have found the additional nodes for the network representing the new parking areas adapted for dangerous goods, under the constraint that the distance between two parking areas must be less than or equal to 400 km.
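
A hedged sketch of the 400 km constraint check: on a weighted road graph, flag every pair of parking areas whose shortest road distance exceeds 400 km, i.e. corridors where an additional adapted area would be required. Node names and distances are invented, and the paper's actual placement algorithm is not reproduced.

```python
# Sketch: detect corridors violating the 400 km spacing constraint.
import itertools
import networkx as nx

G = nx.Graph()  # nodes: parking areas (P*) and junctions (J*); weights in km
G.add_weighted_edges_from([("P1", "J1", 180), ("J1", "P2", 150),
                           ("J1", "J2", 260), ("J2", "P3", 220)])

parking = [n for n in G if n.startswith("P")]
for a, b in itertools.combinations(parking, 2):
    d = nx.shortest_path_length(G, a, b, weight="weight")
    if d > 400:
        print(f"gap {a}-{b}: {d} km > 400 km, new adapted area needed")
```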

Keywords: trans-european transport network, dangerous goods, parking areas, graph-based modeling

Procedia PDF Downloads 267
27443 Optimization of Feeder Bus Routes at Urban Rail Transit Stations Based on Link Growth Probability

Authors: Yu Song, Yuefei Jin

Abstract:

Urban public transportation can be integrated when there is an efficient connection between urban rail lines; however, no effective or quick solutions for this connection are currently being investigated. This paper analyzes the space-time distribution and travel demand of passenger connection trips based on taxi track data and road network data, mines potential bus connection stations from potential connection demand data, and introduces the link growth probability model from complex networks to solve for the basic connecting bus lines, in order to ascertain the direction of the bus lines that are the most connected given the demand characteristics. Then, a constraint-based tree-view exhaustive approach grounded in graph theory is suggested, which can hasten the convergence of the results during chain calculations. This study uses WEI QU NAN Station, the Xi'an Metro Line 2 terminal station in Shaanxi Province, as an illustration to evaluate the efficacy of the model and the solution method. According to the findings, 153 prospective stations were identified in total, the feeder bus network for the entire line has been laid out, and the best route adjustment strategy has been found.
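
As an illustration only (the paper's link growth probability model is not reproduced here), a common proxy for how likely a new link is to form is networkx's preferential-attachment score, sketched below on an invented feeder network:

```python
# Illustrative link-growth scoring on an invented feeder network.
import networkx as nx

G = nx.Graph([("rail_station", "s1"), ("rail_station", "s2"),
              ("s1", "s3"), ("s2", "s3"), ("s3", "s4")])

candidates = [("rail_station", "s3"), ("s1", "s4")]  # possible new segments
for u, v, p in nx.preferential_attachment(G, candidates):
    print(f"{u}-{v}: preferential attachment score {p}")
```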

Keywords: feeder bus, route optimization, link growth probability, graph theory

Procedia PDF Downloads 62
27442 Classification of Poverty Level Data in Indonesia Using the Naïve Bayes Method

Authors: Anung Style Bukhori, Ani Dijah Rahajoe

Abstract:

Poverty poses a significant challenge in Indonesia, requiring an effective analytical approach to understand and address this issue. In this research, we applied the Naïve Bayes classification method to examine and classify poverty data in Indonesia. The main focus is on classifying data using RapidMiner, a powerful data analysis platform. The analysis process involves data splitting to train and test the classification model. First, we collected and prepared a poverty dataset that includes various factors such as education, employment, and health. The experimental results indicate that the Naïve Bayes classification model can provide predictions regarding the risk of poverty. The use of RapidMiner in the analysis process offers flexibility and efficiency in evaluating the model's performance. The classification produces several values to serve as the standard for classifying poverty data in Indonesia using Naïve Bayes. The overall accuracy obtained is 40.26%, with recall of 35.94% for the moderate class, 63.16% for the high class, and 38.03% for the low class. The precision is 58.97% for the moderate class, 17.39% for the high class, and 58.70% for the low class.
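
A minimal sketch of the same workflow outside RapidMiner, using scikit-learn's Gaussian Naïve Bayes on synthetic stand-in features and reporting per-class precision and recall as above:

```python
# Sketch: train/test split plus Naive Bayes with per-class metrics.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))   # stand-ins for education, employment, health
y = rng.integers(0, 3, 300)     # 0 = low, 1 = moderate, 2 = high poverty risk

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
model = GaussianNB().fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te),
                            target_names=["low", "moderate", "high"]))
```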

Keywords: poverty, classification, naïve bayes, Indonesia

Procedia PDF Downloads 42
27441 Quantifying Multivariate Spatiotemporal Dynamics of Malaria Risk Using Graph-Based Optimization in Southern Ethiopia

Authors: Yonas Shuke Kitawa

Abstract:

Background: Although malaria incidence has fallen sharply over the past few years, the rate of decline varies by district, time, and malaria type. Despite this decline, malaria remains a major public health threat in various districts of Ethiopia. Consequently, the present study is aimed at developing a predictive model that helps to identify the spatio-temporal variation in malaria risk by multiple Plasmodium species. Methods: We propose a multivariate spatio-temporal Bayesian model to obtain a more coherent picture of the temporally varying spatial variation in disease risk. The spatial autocorrelation in such a data set is typically modeled by a set of random effects that are assigned a conditional autoregressive prior distribution. However, the autocorrelation considered in such cases depends on a binary neighborhood matrix specified through the border-sharing rule. Here, we propose a graph-based optimization algorithm for estimating the neighborhood matrix that better represents the spatial correlation, by exploring the areal units as the vertices of a graph and the neighbor relations as the set of edges. Furthermore, we used aggregated malaria counts in southern Ethiopia from August 2013 to May 2019. Results: We found that precipitation, temperature, and humidity are positively associated with the malaria threat in the area. On the other hand, the enhanced vegetation index, nighttime light (NTL), and distance from coastal areas are negatively associated. Moreover, nonlinear relationships were observed between malaria incidence and precipitation, temperature, and NTL. Additionally, lagged effects of temperature and humidity have a significant effect on malaria risk for either species. A more elevated risk of P. falciparum was observed following the rainy season, and unstable transmission of P. vivax was observed in the area. Finally, P. vivax risks are less sensitive to environmental factors than those of P. falciparum. Conclusion: Improved inference was gained by employing the proposed approach in comparison to the commonly used border-sharing rule. Additionally, different covariates were identified, including delayed effects, and elevated risks for either of the cases were observed in districts found in the central and western regions. As malaria transmission operates in a spatially continuous manner, a spatially continuous model should be employed when it is computationally feasible.
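
The modelling ingredient at stake can be sketched briefly: a CAR prior requires a neighborhood matrix W, and here W is derived from an arbitrary graph over districts (the object the paper optimizes) rather than from the border-sharing rule; the graph below is invented.

```python
# Sketch: neighborhood matrix for a CAR prior derived from a district graph.
import numpy as np
import networkx as nx

G = nx.Graph([(0, 1), (1, 2), (2, 3), (0, 2)])  # districts and neighbor links
W = nx.to_numpy_array(G)                        # binary neighborhood matrix

# Row-standardized weights, as commonly used in CAR/spatial models.
W_std = W / W.sum(axis=1, keepdims=True)
print(W_std)
```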

Keywords: disease mapping, MSTCAR, graph-based optimization algorithm, P. falciparum, P. vivax, weighting matrix

Procedia PDF Downloads 58
27440 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering  

Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi

Abstract:

In the recent decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA. Accordingly, identifying the tissue origin of these DNA fragments from the plasma can result in more accurate and faster disease diagnosis and more precise treatment protocols. Open chromatin regions are important epigenetic features of DNA that reflect cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. There have been several studies in the area of cancer liquid biopsy that integrate distinct genomic and epigenomic features for early cancer detection along with tissue-of-origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which leads to substantial cost and time. To overcome these limitations, the idea of predicting OCRs from WGS is of particular importance. In this regard, we proposed a computational approach to predict open chromatin regions, as an important epigenetic feature, from cell-free DNA whole genome sequencing data. To fulfill this objective, local sequencing depth is fed to our proposed algorithm, and the most probable open chromatin regions are predicted from the whole genome sequencing data. Our method integrates signal processing with sequencing depth data and includes count normalization, Discrete Fourier Transform conversion, graph construction, graph cut optimization by linear programming, and clustering. To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions related to human blood samples in the ATAC-DB database. The percentage of overlap between predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates a meaningful prediction. As OCRs are mostly located at the transcription start sites (TSS) of genes, we also compared the concordance between the predicted OCRs and the human gene TSS regions obtained from refTSS, which showed agreement of around 52.04% with all genes and about 78% with housekeeping genes. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to the existence of several confounding factors, such as technical and biological variations. Although this approach is in its infancy, there has already been an attempt to apply it, which led to a tool named OCRDetector, with some restrictions like the need for high-depth cfDNA WGS data, prior information about the OCR distribution, and the consideration of multiple features. In contrast, we implemented a graph signal clustering approach based on a single depth feature in an unsupervised learning manner, which results in faster performance and decent accuracy. Overall, we investigated the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate the genetic and epigenetic aspects of a single whole genome sequencing data set for efficient liquid biopsy-related analysis.
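
One ingredient can be sketched as follows: pairwise correlations of windowed depth signals define a similarity structure, and a pivot-style correlation clustering (Ailon et al.'s scheme, with the pivot taken in order here rather than at random) partitions the windows. This is a simplified proxy for the paper's DFT features and LP-based graph cut, run on synthetic signals.

```python
# Sketch: pivot-style correlation clustering of synthetic depth signals.
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=200)
signals = np.stack([base + rng.normal(scale=0.2, size=200) for _ in range(3)]
                   + [rng.normal(size=200) for _ in range(3)])  # 2 groups

corr = np.corrcoef(signals)  # pairwise correlation of depth windows

def pivot_cluster(nodes, corr, thr=0.5):
    """Cluster each pivot with all nodes positively correlated to it."""
    nodes = list(nodes)
    clusters = []
    while nodes:
        pivot = nodes.pop(0)
        members = [pivot] + [n for n in nodes if corr[pivot, n] > thr]
        nodes = [n for n in nodes if n not in members]
        clusters.append(members)
    return clusters

print(pivot_cluster(range(len(signals)), corr))
```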

Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering

Procedia PDF Downloads 133
27439 Visualizing the Commercial Activity of a City by Analyzing the Data Information in Layers

Authors: Taras Agryzkov, Jose L. Oliver, Leandro Tortosa, Jose Vicent

Abstract:

This paper aims to demonstrate how network models can be used to understand and to deal with some aspects of urban complexity. As is well known, the theory of architecture and urbanism has for decades used intellectual tools based on the "sciences of complexity" as a strategy to propose theoretical approaches about cities and about architecture. In this sense, it is possible to find a vast literature in which, for instance, network theory is used as an instrument to understand very diverse questions about cities: from their commercial activity to their heritage condition. The contribution of this research consists in adding one step of complexity to this process: instead of working with one single primal graph, as is usually done, we will show how new network models arise from the consideration of two different primal graphs interacting in two layers. When we model an urban network through a mathematical structure like a graph, the city is usually represented by a set of nodes and edges that reproduce its topology, with the data generated or extracted from the city embedded in it. All this information is normally displayed in a single layer. Here, we propose to separate the information into two layers so that we can evaluate the interaction between them. Besides, the two layers may be composed of structures that do not have to coincide: from this bi-layer system, groups of interactions emerge, suggesting reflections and, in consequence, possible actions.
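
A toy sketch of the bi-layer idea, with invented node names: two primal graphs over the same city (say, street topology and commercial activity) are joined by explicit inter-layer links so that cross-layer interactions can be queried directly.

```python
# Toy bi-layer network: two primal graphs plus explicit inter-layer links.
import networkx as nx

streets = nx.Graph([("A", "B"), ("B", "C")])  # layer 1: street topology
shops = nx.Graph([("s1", "s2")])              # layer 2: commercial activity

city = nx.Graph()
city.add_edges_from(streets.edges, layer="street")
city.add_edges_from(shops.edges, layer="commerce")
city.add_edge("s1", "B", layer="inter")       # shop s1 sits at street node B

inter = [(u, v) for u, v, d in city.edges(data=True) if d["layer"] == "inter"]
print(inter)                                  # the cross-layer interactions
```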

Keywords: graphs, mathematics, networks, urban studies

Procedia PDF Downloads 168
27438 Language Development and Growing Spanning Trees in Children Semantic Network

Authors: Somayeh Sadat Hashemi Kamangar, Fatemeh Bakouie, Shahriar Gharibzadeh

Abstract:

In this study, we aim to exploit Maximum Spanning Trees (MSTs) of children's semantic networks to investigate their language development. To do so, we examine the graph-theoretic properties of word-embedding networks. The networks are made of the words children learn prior to the age of 30 months as the nodes, and the links are built from the cosine vector similarity of words normatively acquired by children prior to two and a half years of age. These networks are weighted graphs, and the strength of each link is determined by the numerical similarity of the two words (nodes) on its two sides. To avoid reducing the weighted networks to binary ones by setting a threshold, constructing MSTs presents a solution. An MST is a unique sub-graph that connects all the nodes in such a way that the sum of all the link weights is maximized without forming cycles. MSTs, as the backbone of the semantic networks, are suitable for examining developmental changes in semantic network topology in children. From these trees, several parameters were calculated to characterize the developmental change in network organization. We showed that MSTs provide an elegant method sensitive enough to capture subtle developmental changes in semantic network organization.
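
The backbone-extraction step can be sketched with made-up similarities: build a weighted word graph from cosine similarities and extract its Maximum Spanning Tree, from which tree metrics (diameter, leaf fraction, and so on) can track development.

```python
# Sketch: Maximum Spanning Tree of a similarity-weighted word graph.
import networkx as nx

G = nx.Graph()
sims = {("dog", "cat"): 0.82, ("dog", "ball"): 0.40,
        ("cat", "ball"): 0.35, ("ball", "play"): 0.77,
        ("dog", "play"): 0.55}  # invented cosine similarities
for (w1, w2), s in sims.items():
    G.add_edge(w1, w2, weight=s)

mst = nx.maximum_spanning_tree(G, weight="weight")
print(sorted(mst.edges(data="weight")))  # backbone of the semantic network
print(nx.diameter(mst))                  # one tree metric to track over age
```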

Keywords: maximum spanning trees, word-embedding, semantic networks, language development

Procedia PDF Downloads 130
27437 High-Fidelity Materials Screening with a Multi-Fidelity Graph Neural Network and Semi-Supervised Learning

Authors: Akeel A. Shah, Tong Zhang

Abstract:

Computational approaches to learning the properties of materials are commonplace, motivated by the need to screen or design materials for a given application, e.g., semiconductors and energy storage. Experimental approaches can be both time-consuming and costly. Unfortunately, computational approaches such as ab-initio electronic structure calculations and classical or ab-initio molecular dynamics can themselves be too slow for the rapid evaluation of materials, which often involves thousands to hundreds of thousands of candidates. Machine learning assisted approaches have been developed to overcome the time limitations of purely physics-based approaches. These approaches, on the other hand, require large volumes of data for training (hundreds of thousands of examples on many standard data sets such as QM7b). This means that they are limited by how quickly such a large data set of physics-based simulations can be established. At high fidelity, such as configuration interaction, composite methods such as G4, and coupled cluster theory, gathering such a large data set can become infeasible, which can compromise the accuracy of the predictions - many applications require high accuracy, for example band structures and energy levels in semiconductor materials and the energetics of charge transfer in energy storage materials. In order to circumvent this problem, multi-fidelity approaches can be adopted, for example the Δ-ML method, which learns a high-fidelity output from a low-fidelity result such as Hartree-Fock or density functional theory (DFT). The general strategy is to learn a map between the low- and high-fidelity outputs, so that the high-fidelity output is obtained as a simple sum of the physics-based low-fidelity result and a learned correction. Although this requires a low-fidelity calculation, it typically requires far fewer high-fidelity results to learn the correction map; furthermore, the low-fidelity result, such as Hartree-Fock or semi-empirical ZINDO, is typically quick to obtain. For high-fidelity outputs, the result can be an order of magnitude or more of speed-up. In this work, a new multi-fidelity approach is developed, based on a graph convolutional network (GCN) combined with semi-supervised learning. The GCN allows the material or molecule to be represented as a graph, which is known to improve accuracy, as for example in SchNet and MEGNet. The graph incorporates information regarding the number, types and properties of atoms; the types of bonds; and bond angles. The key to the accuracy of multi-fidelity methods, however, is the incorporation of a low-fidelity output to learn the high-fidelity equivalent, in this case by learning their difference. Semi-supervised learning is employed to allow for different numbers of low- and high-fidelity training points, by using an additional GCN-based low-fidelity map to predict high-fidelity outputs. It is shown on four different data sets that a significant (at least one order of magnitude) increase in accuracy is obtained, using one to two orders of magnitude fewer low- and high-fidelity training points. One of the data sets is developed in this work, pertaining to 1000 simulations of quinone molecules (up to 24 atoms) at five different levels of fidelity, furnishing the energy, dipole moment and HOMO/LUMO.
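
The Δ-ML strategy itself can be shown generically (the paper's GCN and semi-supervised machinery are more elaborate): train a model on the high-minus-low fidelity difference using only a few high-fidelity labels, then predict high fidelity as low fidelity plus the learned correction. All data below are synthetic.

```python
# Generic Delta-ML sketch: learn the high-minus-low fidelity correction.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))            # molecular descriptors (stand-ins)
low = X @ rng.normal(size=8)             # cheap level, e.g. DFT-like output
high = low + 0.3 * np.sin(X[:, 0]) + 0.1  # expensive level = low + correction

n_hf = 50                                # only a few high-fidelity labels
delta_model = GradientBoostingRegressor().fit(X[:n_hf], (high - low)[:n_hf])

pred_high = low + delta_model.predict(X)  # correction added to low fidelity
print(np.mean((pred_high - high) ** 2))   # error of the corrected predictions
```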

Keywords: materials screening, computational materials, machine learning, multi-fidelity, graph convolutional network, semi-supervised learning

Procedia PDF Downloads 12
27436 Topological Language for Classifying Linear Chord Diagrams via Intersection Graphs

Authors: Michela Quadrini

Abstract:

Chord diagrams occur throughout mathematics, from the study of RNA to knot theory. They are widely used in the theory of knots and links for studying finite type invariants, whereas in molecular biology one important motivation to study chord diagrams is the problem of RNA structure prediction. An RNA molecule is a linear polymer, referred to as the backbone, that consists of four types of nucleotides. Each nucleotide is represented by a point, whereas each chord of the diagram stands for one interaction, a Watson-Crick base pair, between two nonconsecutive nucleotides. A chord diagram is an oriented circle with a set of n pairs of distinct points, considered up to orientation-preserving diffeomorphisms of the circle. A linear chord diagram (LCD) is a special kind of graph obtained by cutting the oriented circle of a chord diagram. It consists of a line segment, called its backbone, to which are attached a number of chords with distinct endpoints. There is a natural fattening on any linear chord diagram; the backbone lies on the real axis, while all the chords are in the upper half-plane. Each linear chord diagram has a natural genus of its associated surface. To each chord diagram and linear chord diagram, it is possible to associate an intersection graph. It consists of a graph whose vertices correspond to the chords of the diagram, whereas the chord intersections are represented by edges between the vertices. Such an intersection graph carries a lot of information about the diagram. Our goal is to define an LCD equivalence class in terms of identity of intersection graphs, on which many chord diagram invariants depend. For studying these invariants, we introduce a new representation of linear chord diagrams based on a set of appropriate topological operators that permits modelling LCDs in terms of the relations among chords. This set is composed of crossing, nesting, and concatenation. The crossing operator is able to generate the whole space of linear chord diagrams, and we define a multiple context-free grammar able to uniquely generate each LCD, starting from a linear chord diagram and adding a chord for each production of the grammar. In other words, it allows us to associate a unique algebraic term to each linear chord diagram, while the remaining operators allow rewriting the term through a set of appropriate rewriting rules. Such rules define an LCD equivalence class in terms of the identity of intersection graphs. Starting from a modelled RNA molecule and its linear chord diagram, some authors have proposed a topological classification and folding. Our LCD equivalence class could contribute to the RNA folding problem, leading to the definition of an algorithm that calculates the free energy of the molecule more accurately with respect to the existing ones. Such an LCD equivalence class could be useful to obtain a more accurate estimate of the link between the crossing number and the topological genus and to study the relation among other invariants.
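
Under the standard convention that two chords (a, b) and (c, d) on the backbone intersect exactly when their endpoints interleave, the intersection graph of a linear chord diagram can be built in a few lines; the diagram below is invented and exhibits one crossing, one nesting and one concatenation.

```python
# Sketch: intersection graph of a linear chord diagram.
import networkx as nx

chords = [(0, 4), (1, 5), (2, 3), (6, 7)]  # chord endpoints on the backbone

def crosses(c1, c2):
    """Two chords cross iff their endpoints interleave along the backbone."""
    (a, b), (c, d) = sorted(c1), sorted(c2)
    return a < c < b < d or c < a < d < b

IG = nx.Graph()
IG.add_nodes_from(range(len(chords)))      # one vertex per chord
for i in range(len(chords)):
    for j in range(i + 1, len(chords)):
        if crosses(chords[i], chords[j]):
            IG.add_edge(i, j)

print(IG.edges())  # here only chords 0 and 1 interleave
```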

Keywords: chord diagrams, linear chord diagram, equivalence class, topological language

Procedia PDF Downloads 194
27435 Graph Clustering Unveiled: ClusterSyn - A Machine Learning Framework for Predicting Anti-Cancer Drug Synergy Scores

Authors: Babak Bahri, Fatemeh Yassaee Meybodi, Changiz Eslahchi

Abstract:

In the pursuit of effective cancer therapies, the exploration of combinatorial drug regimens is crucial to leverage synergistic interactions between drugs, thereby improving treatment efficacy and overcoming drug resistance. However, identifying synergistic drug pairs poses challenges due to the vast combinatorial space and the limitations of experimental approaches. This study introduces ClusterSyn, a machine learning (ML)-powered framework for classifying anti-cancer drug synergy scores. ClusterSyn employs a two-step approach involving drug clustering and synergy score prediction using a fully connected deep neural network. For each cell line in the training dataset, a drug graph is constructed, with nodes representing drugs and edge weights denoting synergy scores between drug pairs. Drugs are clustered using the Markov clustering (MCL) algorithm, and vectors representing the similarity of drug pairs to each cluster are input into the deep neural network for synergy score prediction (synergy or antagonism). The clustering results demonstrate effective grouping of drugs based on synergy scores, aligning similar synergy profiles. Subsequently, the neural network predictions and the synergy scores of the two drugs with others within their clusters are used to predict the synergy score of the considered drug pair. This approach facilitates comparative analysis with clustering and regression-based methods, revealing the superior performance of ClusterSyn over state-of-the-art methods such as DeepSynergy and DeepDDS on diverse datasets such as O'Neil and ALMANAC. The results highlight the remarkable potential of ClusterSyn as a versatile tool for predicting anti-cancer drug synergy scores.

Keywords: drug synergy, clustering, prediction, machine learning, deep learning

Procedia PDF Downloads 62
27434 Comparison of Unit Hydrograph Models to Simulate Flood Events at the Field Scale

Authors: Imene Skhakhfa, Lahbaci Ouerdachi

Abstract:

To ensure the overall coherence of simulated results, it is necessary to develop a robust validation process. In many applications, one is no longer content to calibrate and validate the model only against the hydrograph measured at the outlet; instead, we try to better simulate the functioning of the watershed in space. Therefore, performance is also assessed against other variables, such as water level measurements at intermediate stations or groundwater levels. As part of this work, we limit ourselves to modeling floods of short duration, for which the process of evapotranspiration is negligible. The main parameters used to identify the models are related to the unit hydrograph (UH) method. Three different models were tested: Snyder, Clark and SCS. These models differ in their mathematical structure and the parameters to be calibrated, while the hydrological data are the same: the initial water content and precipitation. The models are compared on the basis of their performance in terms of six objective criteria: three global criteria and three criteria representing volume, peak flow, and the mean square error. The first type of criteria gives more weight to strong events, whereas the second considers all events to be of equal weight. The results show that the calibrated parameter values are interdependent, and the comparison also highlights the problems associated with the simulation of low-flow events and intermittent precipitation.
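
Typical objective criteria of the kind mentioned above can be computed directly from observed and simulated hydrographs; the sketch below shows the Nash-Sutcliffe efficiency together with relative peak-flow and volume errors on invented discharge series (the paper's exact six criteria are not reproduced).

```python
# Sketch: common objective criteria for hydrograph comparison.
import numpy as np

obs = np.array([1.0, 3.5, 8.2, 6.0, 2.5, 1.2])  # observed discharge (m3/s)
sim = np.array([0.9, 3.0, 7.6, 6.4, 2.8, 1.1])  # simulated discharge (m3/s)

nse = 1 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)
peak_err = (sim.max() - obs.max()) / obs.max()   # relative peak-flow error
vol_err = (sim.sum() - obs.sum()) / obs.sum()    # relative volume error

print(f"NSE={nse:.3f}, peak error={peak_err:+.1%}, volume error={vol_err:+.1%}")
```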

Keywords: model calibration, intensity, runoff, hydrograph

Procedia PDF Downloads 475
27433 Particle Size Analysis of Itagunmodi Southwestern Nigeria Alluvial Gold Ore Sample by Gaudin Schumann Method

Authors: Olaniyi Awe, Adelana R. Adetunji, Abraham Adeleke

Abstract:

Mining of alluvial gold ore by artisanal miners has been going on for decades at Itagunmodi, Southwestern Nigeria. In order to optimize the traditional panning gravity separation method commonly used in the area, a mineral particle size analysis study is critical. This study analyzed alluvial gold ore samples collected at five identified locations in the area, with a view to determining the ore particle size distributions. A measured 500 g of as-received alluvial gold ore sample was introduced into the uppermost sieve of an electrical sieve shaker consisting of sieves arranged in order of decreasing nominal apertures of 5600 μm, 3350 μm, 2800 μm, 355 μm, 250 μm, 125 μm and 90 μm, and operated for 20 minutes. The amount of material retained on each sieve was measured and tabulated for analysis. A screen analysis graph was drawn for each of the screen tests on the alluvial samples using the Gaudin-Schumann method. The study showed that the percentages of the fine particle size -125+90 μm fraction were 45.00%, 36.00%, 39.60%, 43.00% and 36.80% for the selected samples. These primary ore characteristic results provide reference data for alluvial gold ore processing method selection, process performance measurement and optimization.
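
The Gaudin-Schumann analysis assumes the cumulative percent passing follows P(x) = 100 (x/k)^m, a straight line on log-log axes; the sketch below recovers the distribution modulus m and the size modulus k by a log-log fit. The passing percentages are illustrative, not the study's measurements.

```python
# Sketch: fitting the Gaudin-Schumann model P(x) = 100 * (x / k)**m.
import numpy as np

size = np.array([90, 125, 250, 355, 2800, 3350])          # sieve aperture (um)
passing = np.array([10.3, 12.5, 18.9, 23.4, 80.7, 89.9])  # cumulative % passing

# log10 P = m * log10 x + intercept, so a straight-line fit recovers m.
m, intercept = np.polyfit(np.log10(size), np.log10(passing), 1)
k = 10 ** ((np.log10(100) - intercept) / m)  # size modulus: x where P = 100%

print(f"distribution modulus m = {m:.2f}, size modulus k = {k:.0f} um")
```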

Keywords: alluvial gold ore, sieve shaker, particle size, Gaudin Schumann

Procedia PDF Downloads 32
27432 The Landscape of Multilingualism in the Urban Community of Limassol

Authors: Antigoni Parmaxi, Anna Nicolaou, Salomi Papadima-Sophocleous, Dimitrios Boglou

Abstract:

This study provides an overview of the sociolinguistic situation of an under-researched city, Limassol, Cyprus, with regard to multilingualism and plurilingualism. More specifically, it explores issues pertaining to multilingualism and plurilingualism in education, the public sphere, economic life, the private sphere, and urban spaces. Through an examination of Limassol's history of language diversity, as well as through an analysis of the city from a contemporary point of view, the study attempts to portray the multilingual Limassol of yesterday and of today. Findings demonstrate several aspects of multilingualism, such as how communication is achieved among citizens, how the city encourages multilingualism, and what policies and practices are implemented in the various spheres in order to promote intercultural dialogue and mutual understanding. Based on the findings, suggestions for best practices, the introduction or improvement of policies, and visions for the city are put forward.

Keywords: language diversity, social inclusion, multilingualism, language visibility, language policy

Procedia PDF Downloads 462
27431 A Tool for Facilitating an Institutional Risk Profile Definition

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the easy creation of an institutional risk profile for the endangerment analysis of file formats. The main contribution of this work is the employment of data mining techniques to set up risk factors with just the values that are most important for a particular organisation. Subsequently, the risk profile employs fuzzy models and associated configurations for the file format metadata aggregator to support digital preservation experts with a semi-automatic estimation of the endangerment level of file formats. Our goal is to make use of a domain expert knowledge base aggregated from a digital preservation survey in order to detect preservation risks for a particular institution. Another contribution is support for the visualisation and analysis of risk factors for a required dimension. The proposed methods improve the visibility of risk factor information and the quality of the digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and automatically aggregated file format metadata from linked open data sources. To facilitate decision making, the aggregated information about the risk factors is presented as a multidimensional vector. The goal is to visualise particular dimensions of this vector for analysis by an expert. A sample risk profile calculation and the visualisation of some risk factor dimensions are presented in the evaluation section.

Keywords: digital information management, file format, endangerment analysis, fuzzy models

Procedia PDF Downloads 392
27430 Thermal Analysis of Automobile Radiator Using Nanofluids

Authors: S. Sumanth, Babu Rao Ponangi, K. N. Seetharamu

Abstract:

As technology emerges day by day, there is a need for a better methodology to enhance radiator performance, and nanofluids are one area that promises such enhancement. Currently, nanofluids offer an effective solution for enhancing the performance of automobile radiators. A nanofluid is prepared by suspending nano-sized particles, which have better thermal conductivity than the base fluid, in the base fluid. In the current work, a mathematical formulation governing the performance of the radiator is first carried out. The results are then presented as graphs for different parameters, which demonstrate the enhancement of radiator performance using nanofluids.

Keywords: nanofluid, radiator performance, graphene, gamma aluminium oxide (γ-Al2O3), titanium dioxide (TiO2)

Procedia PDF Downloads 236
27429 Web Page Design Optimisation Based on Segment Analytics

Authors: Varsha V. Rohini, P. R. Shreya, B. Renukadevi

Abstract:

Web analytics is the measurement, collection and analysis of webpage data to optimize information delivery and web usage. Page statistics and user metrics are the main factors in most web analytics tools; their limitation is that they do not provide design inputs for optimizing the displayed information. This paper aims at extending the scope of web analytics to provide analysis and statistics for each segment of a webpage. Click counts are calculated and the concentration of links in a web page is obtained; these user metrics help in the proper design of the displayed content in a webpage using the Vision-Based Page Segmentation (VIPS) algorithm. When the algorithm is applied to a web page, it divides the entire page into a visual block tree. The visual block tree further divides the web page into visual blocks, or segments, which help us to understand the usage of each segment in a page and its content. Dynamic web pages and deep web pages are used to extend the scope of web page segment analytics. The space optimization concept is applied with the help of the output obtained from the VIPS algorithm. This technique provides visibility into user interaction with webpages and helps us to place the important links in the appropriate segments of a webpage, effectively managing the space in a page and the concentration of links.

Keywords: analytics, design optimization, visual block trees, vision based technology

Procedia PDF Downloads 258
27428 Radar Charts Analysis to Compare the Level of Innovation in Mexico with Most Innovative Countries in Triple Helix Schema Economic and Human Factor Dimension

Authors: M. Peña Aguilar Juan, Valencia Luis, Pastrana Alberto, Nava Estefany, A. Martinez, M. Vivanco, A. Castañeda

Abstract:

This paper seeks to compare the innovation of Mexico, from an economic and human perspective, with the seven most innovative countries according to the Global Innovation Index 2013, published by the World Intellectual Property Organization (WIPO). The analysis considers nine dimensions: expenditure on R&D, intellectual property, an appropriate environment for conducting business, economic stability, triple helix for R&D, ICT infrastructure, education, human resources and quality of life. Each dimension is represented by an indicator, which is later used to construct a radial graph that compares the innovative capacity of the countries analysed. As a result, a new innovation indicator called the Area of Innovation is proposed. Observations are made from the results, and finally, as a conclusion, those items or dimensions in which Mexico lags in innovation are identified.
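
Assuming the Area of Innovation is the area of the polygon traced by the nine normalized indicators on the radar chart (an interpretation, not a formula stated in the abstract), it can be computed as the sum of the triangles between consecutive axes:

```python
# Sketch: polygon area on a radar chart with equally spaced axes.
import numpy as np

scores = np.array([0.7, 0.5, 0.6, 0.8, 0.4, 0.6, 0.5, 0.7, 0.6])  # 9 dimensions
n = len(scores)
theta = 2 * np.pi / n  # angle between consecutive radar axes

# Each triangle between neighbouring axes has area 0.5*r_i*r_{i+1}*sin(theta);
# np.roll wraps around so the last indicator pairs with the first.
area = 0.5 * np.sum(scores * np.roll(scores, -1)) * np.sin(theta)
print(f"Area of Innovation = {area:.3f}")
```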

Keywords: dimension, measure, innovation level, economy, radar chart

Procedia PDF Downloads 454
27427 Post Liberal Perspective on Minorities Visibility in Contemporary Visual Culture: The Case of Mizrahi Jews

Authors: Merav Alush Levron, Sivan Rajuan Shtang

Abstract:

From as early as their emergence in Europe and the US, the postmodern and post-colonial paradigms have formed the backbone of the visual culture field of study. The self-representation project of political minorities is studied, described and explained within the premises and perspectives drawn from these paradigms, addressing the key issue they had raised: modernism's crisis of representation. The struggle for self-representation, agency and multicultural visibility sought to challenge the liberal pretense of universality and equality, hitting at its different blind spots, on issues such as class, gender, race, sex, and nationality. This struggle yielded subversive identity and hybrid performances, including reclaiming, mimicry and masquerading. These performances sought to defy the uniform, universal self, which forms the basis for the liberal, rational, enlightened subject. This research argues that this politics of representation is itself confined within liberal thought. Alongside post-colonialism and multiculturalism's contribution to undermining oppressive structures of power, generating diversity in cultural visibility, and exposing the failure of liberal colorblindness, this subversion is constituted in the visual field by way of confrontation, flying in the face of the universal law and relying on its ongoing comparison and attribution to this law. Relying on Deleuze and Guattari, this research sets out to draw theoretic and empiric attention to an alternative, post-liberal occurrence which has been taking place in the visual field in parallel to the contra-hegemonic phase and as a product of political reality in the aftermath of the crisis of representation. It is no longer a counter-representation; rather, it is a motion of organic minor desire, progressing in the form of flows and generating what Deleuze and Guattari termed deterritorialization of social structures. This discussion focuses on current post-liberal performances of 'Mizrahim' (Jewish Israelis of Arab and Muslim extraction) in the visual field in Israel. In television, video art and photography, these performances challenge the issue of representation and generate concrete peripheral Mizrahiness, realized in the visual organization of the photographic frame. Mizrahiness then transforms from 'confrontational' representation into a 'presence', flooding the visual sphere in plain sight, in a process of 'becoming'. The Mizrahi desire is exerted on the planes of sound, spoken language, the body and the space where they appear. It removes from these planes the coding and stratification engendered by European dominance and rational, liberal enlightenment. This stratification, adhering to the hegemonic surface, is flooded not by way of resisting false consciousness or employing hybridity, but by way of the Mizrahi identity's own productive, material, immanent yearning. The Mizrahi desire reverberates with Mizrahi peripheral 'worlds of meaning', where post-colonial interpretation almost invariably identifies a product of internalized oppression, and a recurrence thereof, rather than a source in itself - an 'offshoot, never a wellspring', as Nissim Mizrachi clarifies in his recent pioneering work. The peripheral Mizrahi performance 'unhooks itself', in Deleuze and Guattari's words, from the point of subjectification and interpretation, and does not correspond with the partialness, absence, and split that mark post-colonial identities.

Keywords: desire, minority, Mizrahi Jews, post-colonialism, post-liberalism, visibility, Deleuze and Guattari

Procedia PDF Downloads 315
27426 Malware Beaconing Detection by Mining Large-scale DNS Logs for Targeted Attack Identification

Authors: Andrii Shalaginov, Katrin Franke, Xiongwei Huang

Abstract:

One of the leading problems in cyber security today is the emergence of targeted attacks conducted by adversaries with access to sophisticated tools. These attacks usually steal senior-level employees' system privileges in order to gain unauthorized access to confidential knowledge and valuable intellectual property. Malware used for the initial compromise of the systems is sophisticated and may target zero-day vulnerabilities. In this work we utilize a common behaviour of malware called "beaconing", which implies that infected hosts communicate with Command and Control servers at regular intervals that have relatively small time variations. By analysing such beacon activity through passive network monitoring, it is possible to detect potential malware infections. We therefore focus on time gaps as indicators of possible C2 activity in targeted enterprise networks. We represent DNS log files as a graph, whose vertices are destination domains and edges are timestamps. Then, by using four periodicity detection algorithms for each pair of internal-external communications, we check timestamp sequences to identify beaconing activities. Finally, based on the graph structure, we infer the existence of other infected hosts and malicious domains enrolled in the attack activities.
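
One simple periodicity test of the kind described: for each (internal host, external domain) pair, beaconing shows near-constant gaps between lookups, so a low coefficient of variation of the inter-arrival times flags candidate C2 channels. The timestamps below are invented, and the paper combines four such detectors.

```python
# Sketch: flag beacon-like DNS lookup patterns via inter-arrival regularity.
import numpy as np

lookups = {  # (internal host, external domain) -> lookup timestamps (seconds)
    ("10.0.0.5", "evil.example"): [0, 60, 119, 181, 240, 299],
    ("10.0.0.7", "cdn.example"): [3, 10, 95, 400, 404, 900],
}

for pair, ts in lookups.items():
    gaps = np.diff(np.sort(ts))
    cv = gaps.std() / gaps.mean()  # small cv => near-constant beacon interval
    print(pair, f"cv={cv:.2f}", "beacon-like" if cv < 0.1 else "irregular")
```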

Keywords: malware detection, network security, targeted attack, computational intelligence

Procedia PDF Downloads 249
27425 Genderqueerness in Polish: A Survey-Based Study of Linguistic Strategies Employed by Genderqueer Speakers of Polish

Authors: Szymon Misiek

Abstract:

The genderqueer (or gender non-binary, both terms referring to individuals who identify as neither men nor women) community has been gaining greater visibility over the last few years. This includes legal recognition, representation in popular media, and the inclusion of non-binary perspectives in research on transgender issues. Another important aspect of visibility is language. Gender neutrality, often associated with genderqueer people, is relatively easy to achieve in natural-gender languages such as English. This can be observed in the growing popularity of the 'singular they' pronoun (used specifically with reference to genderqueer individuals) or the gender-neutral title 'Mx.' (as an alternative to 'Ms./Mr.'). 'Singular they' seems to have become a certain standard in the genderqueer community. Grammatical-gender languages, such as Polish, pose a greater challenge to genderqueer speakers. In Polish, every noun is inherently gendered, while verbs, adjectives, and pronouns inflect for gender. Those who do not wish to settle for using only either masculine or feminine forms (which some genderqueer Polish speakers do choose) have to somehow mix the two, attempt to avoid gendered forms altogether, or turn to non-standard forms, such as the neuter (not used for people in standard Polish), plurals (vaguely akin to English 'singular they'), or neologisms (such as verb forms using the '-u-' affix). The following paper presents the results of a survey conducted among genderqueer speakers of Polish regarding their choice of linguistic strategies. As no definitive standard such as 'singular they' has (yet) emerged, it rather seeks to emphasize the diversity of the chosen strategies and their relation to a person's specific identity as well as the context in which an exchange takes place. The findings of the study may offer an insight into how heavily gendered languages deal with non-normatively gendered experiences, and to what extent English influences this process (e.g., the majority of genderqueer Poles choose English terms to label their identity), as well as help design good practices aimed at achieving gender equality in speech.

Keywords: genderqueer, grammatical gender in Polish, non-binary, transgender

Procedia PDF Downloads 121