Search results for: attributed graph clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2015

1865 QoS-CBMG: A Model for e-Commerce Customer Behavior

Authors: Hoda Ghavamipoor, S. Alireza Hashemi Golpayegani

Abstract:

An approach to modeling customer interaction with e-commerce websites is presented. Considering the service quality level as a predictive feature, we offer an improved method based on the Customer Behavior Model Graph (CBMG), a state-transition graph model. To derive the Quality of Service sensitive-CBMG (QoS-CBMG) model, process-mining techniques are applied to pre-processed website server logs, which are categorized as ‘buy’ or ‘visit’. Experimental results on data from an e-commerce website confirmed that the proposed method outperforms the CBMG-based method.

Keywords: customer behavior model, electronic commerce, quality of service, customer behavior model graph, process mining

Procedia PDF Downloads 386
1864 The Problems of Current Earth Coordinate System for Earthquake Forecasting Using Single Layer Hierarchical Graph Neuron

Authors: Benny Benyamin Nasution, Rahmat Widia Sembiring, Abdul Rahman Dalimunthe, Nursiah Mustari, Nisfan Bahri, Berta br Ginting, Riadil Akhir Lubis, Rita Tavip Megawati, Indri Dithisari

Abstract:

The earth coordinate system is an important part of any attempt at earthquake forecasting, such as the one using the Single Layer Hierarchical Graph Neuron (SLHGN). However, a number of problems need to be worked out before the coordinate system can be used by the forecaster. One example is that SLHGN requires the focused area of an earthquake to be constructed in a grid-like form. In the current earth coordinate system, however, the same longitude difference produces different distances, as can be observed by comparing a distance on the Equator with one near the poles. To deal with this problem, a coordinate system has been developed to support the ongoing earthquake forecasting using SLHGN. Two important features have been developed in this system: 1) each location is represented not by two values (longitude and latitude) but by a single value, and 2) the conversion of the earth coordinate system to the x-y Cartesian system requires no angular formulas and is therefore fast. The accuracy and the performance have not been measured yet, since earthquake data is difficult to obtain. However, the characteristics of the SLHGN results are very promising.
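The grid problem described in this abstract can be illustrated with a short calculation. The sketch below is not the authors' coordinate system; it only shows, under a spherical-Earth assumption with a mean radius of 6371 km (both assumptions introduced here), how the ground distance covered by one degree of longitude shrinks with latitude.

```python
import math

# Assumption for illustration: a spherical Earth with mean radius 6371 km.
EARTH_RADIUS_KM = 6371.0


def km_per_degree_longitude(latitude_deg):
    """Ground distance (km) spanned by one degree of longitude at a latitude."""
    # A circle of latitude has radius R * cos(latitude), so one degree of
    # longitude covers (pi / 180) * R * cos(latitude) kilometres.
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(latitude_deg))


if __name__ == "__main__":
    for lat in (0, 30, 60, 89):
        print(f"latitude {lat:2d} deg: {km_per_degree_longitude(lat):8.2f} km per degree")
```

At 60 degrees latitude the spacing is half of its value on the Equator, which is why a regular longitude/latitude grid is not a regular distance grid.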

Keywords: hierarchical graph neuron, multidimensional hierarchical graph neuron, single layer hierarchical graph neuron, natural disaster forecasting, earthquake forecasting, earth coordinate system

Procedia PDF Downloads 195
1863 GRCNN: Graph Recognition Convolutional Neural Network for Synthesizing Programs from Flow Charts

Authors: Lin Cheng, Zijiang Yang

Abstract:

Program synthesis is the task of automatically generating programs based on a user specification. In this paper, we present a framework that synthesizes programs from flow charts, which serve as an accurate and intuitive specification. To do so, we propose a deep neural network called GRCNN that recognizes graph structure from its image. GRCNN is trained end-to-end and can predict edge and node information of the flow chart simultaneously. Experiments show that the accuracy rate for synthesizing a program is 66.4%, and the accuracy rates for recognizing edges and nodes are 94.1% and 67.9%, respectively. On average, it takes about 60 milliseconds to synthesize a program.

Keywords: program synthesis, flow chart, specification, graph recognition, CNN

Procedia PDF Downloads 99
1862 Mostar Type Indices and QSPR Analysis of Octane Isomers

Authors: B. Roopa Sri, Y Lakshmi Naidu

Abstract:

Chemical Graph Theory (CGT) is the branch of mathematical chemistry in which molecules are modeled to study their physicochemical properties using molecular descriptors. Amongst these descriptors, topological indices play a vital role in predicting the properties by defining the graph topology of the molecule. Recently, the bond-additive topological index known as the Mostar index has been proposed. In this paper, we compute the Mostar-type indices of octane isomers and use the data obtained to perform QSPR analysis. Furthermore, we show the correlation between the Mostar type indices and the properties.
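As a small illustration of a bond-additive index, the sketch below computes the basic Mostar index, using its usual definition from the literature (not stated in the abstract): the sum over all edges uv of |n_u − n_v|, where n_u counts the vertices closer to u than to v. The two hydrogen-suppressed carbon skeletons and the function names are chosen here for illustration; the other Mostar-type indices and the QSPR regression of the paper are not reproduced.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path distances from source in an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def mostar_index(edges):
    """Sum over edges uv of |n_u - n_v| (vertices closer to u vs. closer to v)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    dist = {v: bfs_distances(adjacency, v) for v in adjacency}
    total = 0
    for u, v in edges:
        n_u = sum(1 for w in adjacency if dist[u][w] < dist[v][w])
        n_v = sum(1 for w in adjacency if dist[v][w] < dist[u][w])
        total += abs(n_u - n_v)
    return total


# Carbon skeletons of two octane isomers (hydrogens suppressed).
n_octane = [(i, i + 1) for i in range(1, 8)]  # a path of 8 carbons
tetramethylbutane = [(2, 3), (1, 2), (5, 2), (6, 2), (3, 4), (3, 7), (3, 8)]

if __name__ == "__main__":
    print(mostar_index(n_octane))           # 24
    print(mostar_index(tetramethylbutane))  # 36
```

The two isomers have the same number of atoms and bonds but different index values, which is what makes such descriptors usable as predictors in a QSPR analysis.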

Keywords: chemical graph theory, mostar type indices, octane isomers, qspr analysis, topological index

Procedia PDF Downloads 106
1861 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected over three years from an actual harmonic monitoring system in a distribution system in Australia. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes that support a better understanding of the data are presented. Underlying classes in this data are then identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 221
1860 Drug-Drug Interaction Prediction in Diabetes Mellitus

Authors: Rashini Maduka, C. R. Wijesinghe, A. R. Weerasinghe

Abstract:

Drug-drug interactions (DDIs) can happen when two or more drugs are taken together. Today, DDIs have become a serious health issue due to adverse drug effects. In vivo and in vitro methods for identifying DDIs are time-consuming and costly. Therefore, in-silico approaches are preferred for DDI identification. Most machine learning models for DDI prediction use chemical and biological drug properties as features. However, some drug features are unavailable or costly to extract, so automatic feature engineering is preferable. Furthermore, people who have diabetes already suffer from other diseases and take more than one medicine together. Adverse drug effects may therefore occur in diabetic patients and cause unpleasant reactions in the body. In this study, we present a model with a graph convolutional autoencoder and a graph decoder using a dataset from DrugBank version 5.1.3. The main objective of the model is to identify unknown interactions between antidiabetic drugs and the drugs taken by diabetic patients for other diseases. We considered automatic feature engineering and used only known DDIs as the input for the model. Our model achieved 0.86 in AUC and 0.86 in AP.

Keywords: drug-drug interaction prediction, graph embedding, graph convolutional networks, adverse drug effects

Procedia PDF Downloads 71
1859 Scheduling in Cloud Networks Using Chakoos Algorithm

Authors: Masoumeh Ali Pouri, Hamid Haj Seyyed Javadi

Abstract:

Nowadays, cloud processing is one of the important issues in information technology. Since task graph scheduling is an NP-hard problem, approaches based on non-deterministic methods such as evolutionary processing, mostly genetic and cuckoo algorithms, can be effective. Therefore, an efficient algorithm is proposed for task graph scheduling to obtain an appropriate schedule with minimum time. The new approach in this algorithm is based on shortening the critical path and reducing the communication cost. Finally, the results obtained from implementing the presented method show that this algorithm performs the same as other algorithms on graphs without communication cost, and faster and better than some algorithms, such as DSC and MCP, on graphs involving communication cost.

Keywords: cloud computing, scheduling, tasks graph, chakoos algorithm

Procedia PDF Downloads 36
1858 Nonlinear Evolution on Graphs

Authors: Benniche Omar

Abstract:

We are concerned with abstract fully nonlinear differential equations of the form y’(t)=Ay(t)+f(t,y(t)), where A is an m-dissipative operator (possibly multi-valued) defined on a subset D(A) of a Banach space X with values in X, and f is a given function defined on I×X with values in X. We consider a graph K in I×X. We recall that K is said to be viable with respect to the above abstract differential equation if, for each initial datum in K, there exists at least one trajectory starting from that initial datum and remaining in K at least for a short time. The viability problem has been studied by many authors using various techniques and frameworks. If K is closed, it is shown that a tangency condition, which is mainly linked to the dynamics, is crucial for viability. When X is infinite-dimensional, compactness and convexity assumptions are needed. In this paper, we are concerned with the notion of near viability for a given graph K with respect to y’(t)=Ay(t)+f(t,y(t)). Roughly speaking, the graph K is said to be near viable with respect to y’(t)=Ay(t)+f(t,y(t)) if, for each initial datum in K, there exists at least one trajectory remaining arbitrarily close to K at least for a short time. It is interesting to note that near viability is equivalent to an appropriate tangency condition under mild assumptions on the dynamics. Adding natural convexity and compactness assumptions on the dynamics, we may recover (exact) viability. Here we investigate near viability for a graph K in I×X with respect to y’(t)=Ay(t)+f(t,y(t)), where A and f are as above. We emphasize that the t-dependence of the perturbation f leads us to introduce a new tangency concept. On the basis of a tangency condition expressed in terms of that tangency concept, we formulate criteria for K to be near viable with respect to y’(t)=Ay(t)+f(t,y(t)). As an application, an abstract null-controllability theorem is given.

Keywords: abstract differential equation, graph, tangency condition, viability

Procedia PDF Downloads 119
1857 Identification of Biological Pathways Causative for Breast Cancer Using Unsupervised Machine Learning

Authors: Karthik Mittal

Abstract:

This study performs an unsupervised machine learning analysis to find clusters of related SNPs which highlight biological pathways that are important for the biological mechanisms of breast cancer. Studying genetic variations in isolation is illogical because these genetic variations are known to modulate protein production and function; the downstream effects of these modifications on biological outcomes are highly interconnected. After extracting the SNPs and their effect on different types of breast cancer using the MRBase library, two unsupervised machine learning clustering algorithms were implemented on the genetic variants: a k-means clustering algorithm and a hierarchical clustering algorithm; furthermore, principal component analysis was executed to visually represent the data. These algorithms specifically used the SNP’s beta value on the three different types of breast cancer tested in this project (estrogen-receptor positive breast cancer, estrogen-receptor negative breast cancer, and breast cancer in general) to perform this clustering. Two significant genetic pathways validated the clustering produced by this project: the MAPK signaling pathway and the connection between the BRCA2 gene and the ESR1 gene. This study provides the first proof of concept showing the importance of unsupervised machine learning in interpreting GWAS summary statistics.

Keywords: breast cancer, computational biology, unsupervised machine learning, k-means, PCA

Procedia PDF Downloads 118
1856 Energy-Efficient Clustering Protocol in Wireless Sensor Networks for Healthcare Monitoring

Authors: Ebrahim Farahmand, Ali Mahani

Abstract:

Wireless sensor networks (WSNs) can facilitate continuous monitoring of patients and increase early detection of emergency conditions and diseases. High-density WSNs help us to accurately monitor a remote environment by intelligently combining the data from the individual nodes. Because of the limited energy capacity of sensors, enhancing the lifetime and the reliability of WSNs is an important factor in designing these networks. Clustering strategies have been verified as effective and practical algorithms for reducing energy consumption in WSNs and can tackle the limitations of WSNs. In this paper, an Energy-efficient Weight-based Clustering Protocol (EWCP) is presented. An artificial retina is selected as a case study of WSNs applied in body sensors. The selection of cluster heads (CHs) takes energy-efficiency parameters into account. Moreover, cluster members are selected based on their distance to the selected CHs. Compared with the other benchmark protocols, the lifetime of EWCP is improved significantly.

Keywords: WSN, healthcare monitoring, weighted based clustering, lifetime

Procedia PDF Downloads 289
1855 Modeling of Bioelectric Activity of Nerve Cells Using Bond Graph Method

Authors: M. Ghasemi, F. Eskandari, B. Hamzehei, A. R. Arshi

Abstract:

The bioelectric activity of nerve cells may be altered by various factors. This alteration can lead to unforeseen circumstances in other organs of the body. Therefore, the purpose of this study was to model a single neuron and its behavior under an initial stimulation. The model was developed based on cable theory by means of the Bond Graph method. The numerical values of the parameters were derived from empirical studies of cellular electrophysiology experiments. Initial excitation was applied through square current functions, and the resulting action potential was estimated along the neuron. The results revealed that the model developed in this research agreed with the results of experimental studies and properly reproduced the electrical behavior of nerve cells.

Keywords: bond graph, stimulation, nervous cells, modeling

Procedia PDF Downloads 402
1854 Clustering Based Level Set Evaluation for Low Contrast Images

Authors: Bikshalu Kalagadda, Srikanth Rangu

Abstract:

The main objective of image segmentation is to extract objects with respect to some input features. One of the important methods for image segmentation is the level set method. Medical images and synthetic images generally have a low-contrast pixel profile, which makes it difficult to locate the features of interest. The conventional level set function develops irregularities while evolving the contour of objects, which destroys the stability of the evolution process. As a remedy, a new hybrid algorithm, Clustering Level Set Evolution, is proposed. Kernel fuzzy particle swarm optimization clustering is used with the Distance Regularized Level Set (DRLS) and Selective Binary and Gaussian Filtering Regularized Level Set (SBGFRLS) methods. Identifying different regions becomes easier, with improved speed. The efficiency of the modified method can be evaluated by comparing it with the previous method for similar specifications. The comparison can be carried out on medical and synthetic images.

Keywords: segmentation, clustering, level set function, re-initialization, Kernel fuzzy, swarm optimization

Procedia PDF Downloads 329
1853 Robust Diagnosis Efficiency by Bond-Graph Approach

Authors: Benazzouz Djamel, Termeche Adel, Touati Youcef, Alem Said, Ouziala Mahdi

Abstract:

This paper presents an approach that efficiently detects and isolates a fault in a system. The approach avoids false alarms, non-detections and delays in detecting faults. A case study is presented to show the importance of taking uncertainties into consideration in the decision-making procedure, their effect on the degradation of diagnostic performance, and the advantage of using the Bond Graph (BG) for such degradation. Using the BG in Linear Fractional Transformation (LFT) form allows robust Analytical Redundancy Relations (ARRs) to be generated, where the uncertain part of the ARRs is used to generate adaptive thresholds for the residuals. The case study concerns an electromechanical system composed of a motor, a reducer and an external load. The aim of this application is to show the effectiveness of the BG-LFT approach to robust fault detection.

Keywords: bond graph, LFT, uncertainties, detection and faults isolation, ARR

Procedia PDF Downloads 281
1852 Component Based Testing Using Clustering and Support Vector Machine

Authors: Iqbaldeep Kaur, Amarjeet Kaur

Abstract:

Software reusability is an important part of software development. Component-based software development has therefore gained a lot of practical importance in software testing, both among academic researchers and in the software development industry. Finding test cases that can be reused efficiently is one of the important problems researchers aim to solve. Clustering reduces the search space and supports the reuse of test cases by grouping similar entities according to requirements, which reduces time complexity because it shortens the search time for retrieving test cases. In this paper, we propose an unsupervised approach to the reusability of test cases, using k-means and a Support Vector Machine. We have designed an algorithm that clusters requirement and test case documents according to their tf-idf vector space, and the output is a set of highly cohesive pattern groups.

Keywords: software testing, reusability, clustering, k-mean, SVM

Procedia PDF Downloads 402
1851 Clustering Performance Analysis using New Correlation-Based Cluster Validity Indices

Authors: Nathakhun Wiroonsri

Abstract:

There are various cluster validity measures used for evaluating clustering results. One of the main objectives of using these measures is to find the optimal, unknown number of clusters. Some measures work well for clusters with different densities, sizes and shapes. Yet one weakness these validity measures share is that they sometimes provide only one clear optimal number of clusters. That number is actually unknown, and there might be more than one potential sub-optimal option that a user may wish to choose depending on the application. We develop two new cluster validity indices based on the correlation between the actual distance between a pair of data points and the distance between the centroids of the clusters in which the two points are located. Our proposed indices consistently yield several peaks at different numbers of clusters, which overcomes the weakness stated above. Furthermore, the introduced correlation can also be used for evaluating the quality of a selected clustering result. Several experiments in different scenarios, including the well-known iris data set and a real-world marketing application, have been conducted to compare the proposed validity indices with several well-known ones.
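The correlation underlying these indices can be sketched as follows. The abstract does not give the formulas of the two indices, so this is only the raw Pearson correlation between pairwise point distances and the distances between the corresponding cluster centroids; the function names and the toy data are assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def centroid_distance_correlation(points, labels):
    """Correlate point-to-point distances with centroid-to-centroid distances."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    point_dists, centroid_dists = [], []
    for i, j in combinations(range(len(points)), 2):
        point_dists.append(math.dist(points[i], points[j]))
        # Two points in the same cluster share a centroid, so this is 0.
        centroid_dists.append(math.dist(centroids[labels[i]], centroids[labels[j]]))
    return pearson(point_dists, centroid_dists)


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = centroid_distance_correlation(points, [0, 0, 0, 1, 1, 1])
bad = centroid_distance_correlation(points, [0, 1, 0, 1, 0, 1])

if __name__ == "__main__":
    print(f"matching partition: {good:.3f}, mixed partition: {bad:.3f}")
```

A partition that follows the geometry of the data gives a correlation close to 1, while a partition that mixes the groups gives a much lower value. The sketch needs at least two clusters, since a single cluster makes all centroid distances zero and the correlation undefined.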

Keywords: clustering algorithm, cluster validity measure, correlation, data partitions, iris data set, marketing, pattern recognition

Procedia PDF Downloads 83
1850 GeneNet: Temporal Graph Data Visualization for Gene Nomenclature and Relationships

Authors: Jake Gonzalez, Tommy Dang

Abstract:

This paper proposes a temporal graph approach to visualize and analyze the evolution of gene relationships and nomenclature over time. An interactive web-based tool implements this temporal graph, enabling researchers to traverse a timeline and observe coupled dynamics in network topology and naming conventions. Analysis of a real human genomic dataset reveals the emergence of densely interconnected functional modules over time, representing groups of genes involved in key biological processes. For example, the antimicrobial peptide DEFA1A3 shows increased connections to related alpha-defensins involved in infection response. Tracking shifts in degree and betweenness centrality over timeline iterations also quantitatively highlights how the topological importance of certain genes is reprioritized as knowledge advances. Examination of the CNR1 gene encoding the cannabinoid receptor CB1 demonstrates changing synonymous relationships and consolidating naming patterns over time, reflecting the discovery of its unique functional role. The integrated framework interconnecting these topological and nomenclature dynamics provides richer contextual insights than isolated analysis methods. Overall, this temporal graph approach enables a more holistic study of knowledge evolution to elucidate complex biology.

Keywords: temporal graph, gene relationships, nomenclature evolution, interactive visualization, biological insights

Procedia PDF Downloads 35
1849 Personalize E-Learning System Based on Clustering and Sequence Pattern Mining Approach

Authors: H. S. Saini, K. Vijayalakshmi, Rishi Sayal

Abstract:

Network-based education has been growing rapidly in size and quality. Knowledge clustering is becoming more important in personalized information retrieval for web learning. A personalized learning service can be offered after the learners’ knowledge has been classified with clustering. Through automatic analysis of learners’ behaviors, groups of learners with similar knowledge levels and interests may be discovered, so as to provide learners with content that best matches their educational needs for collaborative learning. We present a specific mining tool and a recommender engine that we have integrated into the online learning system to help the teacher carry out the whole e-learning process. We propose to use sequential pattern mining algorithms to discover the paths most used by students and, from this information, to recommend links to new students automatically while they browse the course. We have developed a specific authoring tool to help the teacher apply the whole data mining process. We report on several experiments with real data to show the quality of using clustering and sequential pattern mining algorithms together for building personalized e-learning systems.

Keywords: e-learning, cluster, personalization, sequence, pattern

Procedia PDF Downloads 403
1848 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced from various sources on different topics. The availability of this enormous amount of data requires us to adopt effective computational tools to explore it. This has led to intense and growing interest in the research community in developing computational methods for processing this text data. One line of study focuses on condensing the text so that we can get a higher level of understanding in a shorter time. The two important tasks for doing this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key words of a text, which familiarizes us with its general topic. In text summarization, we are interested in producing a short text that includes the important information in the document. The TextRank algorithm, an unsupervised learning method that extends PageRank (the base algorithm of the Google search engine for searching and ranking pages), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and return them as a result. However, it neglects the semantic similarity between the different parts. In this work, we improve the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used on its own or as part of generating the summary to overcome coverage problems.
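The baseline that this abstract builds on can be sketched as a PageRank iteration over a word co-occurrence graph. The code below is a minimal illustration of plain TextRank-style keyword ranking, not the authors' semantic extension: edges are unweighted, and the window size, damping factor, iteration count and function name are assumptions chosen here. A semantic variant would presumably replace the uniform edge weights with a similarity score.

```python
def textrank_keywords(tokens, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank on an undirected co-occurrence graph."""
    # Link each word to the distinct words that follow it within the window.
    neighbors = {}
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other != word:
                neighbors.setdefault(word, set()).add(other)
                neighbors.setdefault(other, set()).add(word)

    n = len(neighbors)
    scores = {word: 1.0 / n for word in neighbors}
    for _ in range(iterations):
        # Each word shares its score equally among its neighbours.
        scores = {
            word: (1 - damping) / n
            + damping * sum(scores[u] / len(neighbors[u]) for u in neighbors[word])
            for word in neighbors
        }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    text = "graph clustering uses graph structure and graph attributes"
    for word, score in textrank_keywords(text.split()):
        print(f"{word:12s} {score:.3f}")
```

In this toy sentence the word "graph" co-occurs with every other word and therefore receives the highest score.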

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 46
1847 Data Clustering Algorithm Based on Multi-Objective Periodic Bacterial Foraging Optimization with Two Learning Archives

Authors: Chen Guo, Heng Tang, Ben Niu

Abstract:

Clustering splits objects into different groups based on similarity, so that objects have higher similarity within the same group and lower similarity across different groups. Thus, clustering can be treated as an optimization problem that maximizes the intra-cluster similarity or the inter-cluster dissimilarity. In real-world applications, datasets often have complex characteristics: sparsity, overlap, high dimensionality, etc. When facing these datasets, simultaneously optimizing two or more objectives can yield better clustering results than optimizing one objective. However, except for objective weighting methods, traditional clustering approaches have difficulty solving multi-objective data clustering problems. For this reason, researchers have investigated evolutionary multi-objective optimization algorithms to optimize multiple clustering objectives. In this paper, the Data Clustering algorithm based on Multi-objective Periodic Bacterial Foraging Optimization with two Learning Archives (DC-MPBFOLA) is proposed. Specifically, first, to reduce the high computing complexity of the original BFO, periodic BFO is employed as the basic algorithmic framework and is then transformed into a multi-objective type. Second, two learning strategies are proposed based on the two learning archives to guide the bacterial swarm to move in a better direction. On the one hand, the global best is selected from the global learning archive according to the convergence index and diversity index. On the other hand, the personal best is selected from the personal learning archive according to the sum of weighted objectives. A chemotaxis operation is designed according to these learning strategies. Third, an elite learning strategy is designed to provide fresh power to the objects in the two learning archives. When the objects in these two archives do not change for two consecutive times, randomly initializing one dimension of the objects can prevent the proposed algorithm from falling into local optima. Fourth, to validate the performance of the proposed algorithm, DC-MPBFOLA is compared with four state-of-the-art evolutionary multi-objective optimization algorithms and one classical clustering algorithm on evaluation indexes of datasets. To further verify the effectiveness and feasibility of the designed strategies in DC-MPBFOLA, variants of DC-MPBFOLA are also proposed. Experimental results demonstrate that DC-MPBFOLA outperforms its competitors on all evaluation indexes and clustering partitions. These results also indicate that the designed strategies positively influence the performance improvement of the original BFO.

Keywords: data clustering, multi-objective optimization, bacterial foraging optimization, learning archives

Procedia PDF Downloads 113
1846 Review: Wavelet New Tool for Path Loss Prediction

Authors: Danladi Ali, Abdullahi Mukaila

Abstract:

In this work, GSM signal strength (power) was monitored in an indoor environment. Samples of the GSM signal strength were measured on mobile equipment (ME). A one-dimensional multilevel wavelet is used to predict the fading phenomenon of the measured GSM signal, and neural network clustering is used to determine the average power received in the study area. The wavelet prediction revealed that the GSM signal is attenuated due to the fast fading phenomenon, which fades about 7 times faster than the radio wavelength, while the neural network clustering determined that -75dBm appeared most frequently, followed by -85dBm. The work revealed that a significant part of the measured signal is dominated by weak signal and that the signal followed a Rayleigh distribution more closely than a Gaussian one. This confirmed the wavelet prediction.

Keywords: decomposition, clustering, propagation, model, wavelet, signal strength and spectral efficiency

Procedia PDF Downloads 425
1845 Constructing Orthogonal De Bruijn and Kautz Sequences and Applications

Authors: Yaw-Ling Lin

Abstract:

A de Bruijn graph of order k is a graph whose vertices represent all length-k sequences, with edges joining pairs of vertices whose sequences have the maximum possible overlap (length k−1). Every Hamiltonian cycle of this graph defines a distinct, minimum-length de Bruijn sequence containing all k-mers exactly once. A Kautz sequence is the sequence of minimal length that produces all possible length-k sequences under the restriction that every two consecutive symbols in the sequences must be different. A collection of de Bruijn/Kautz sequences is orthogonal if any two sequences differ maximally in sequence composition; that is, the maximum length of their common substring is k. In this paper, we discuss how such a collection of (maximal) orthogonal de Bruijn/Kautz sequences can be constructed and use the algorithm to build a web application service for synthesized DNA and other related biomolecular sequences.
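To make the definition concrete, the sketch below builds a single de Bruijn sequence. It uses the standard equivalent formulation in which the k-mers are the edges of the order-(k−1) graph and an Eulerian cycle is found with Hierholzer's algorithm; this is a textbook construction chosen for illustration, not the paper's method for building orthogonal collections, and the function names are hypothetical.

```python
from itertools import product


def de_bruijn_sequence(alphabet, k):
    """Return a cyclic sequence containing every length-k word exactly once."""
    if k == 1:
        return "".join(alphabet)
    # Vertices are (k-1)-mers; each outgoing edge appends one symbol.
    unused = {"".join(p): list(alphabet) for p in product(alphabet, repeat=k - 1)}
    stack = [alphabet[0] * (k - 1)]
    circuit = []
    # Hierholzer's algorithm: follow unused edges, backtrack when stuck.
    while stack:
        node = stack[-1]
        if unused[node]:
            symbol = unused[node].pop()
            stack.append((node + symbol)[1:])
        else:
            circuit.append(stack.pop())
    circuit.reverse()
    # Each step of the closed walk contributes the last symbol of its vertex.
    return "".join(node[-1] for node in circuit[1:])


def cyclic_kmers(sequence, k):
    """All length-k windows of the sequence read as a cycle."""
    extended = sequence + sequence[:k - 1]
    return [extended[i:i + k] for i in range(len(sequence))]


if __name__ == "__main__":
    seq = de_bruijn_sequence("ACGT", 3)
    print(len(seq), len(set(cyclic_kmers(seq, 3))))
```

For the DNA alphabet and k = 3, the result has 4^3 = 64 symbols and, read cyclically, contains each of the 64 possible 3-mers once.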

Keywords: biomolecular sequence synthesis, de Bruijn sequences, Eulerian cycle, Hamiltonian cycle, Kautz sequences, orthogonal sequences

Procedia PDF Downloads 131
1844 Graph Neural Network-Based Classification for Disease Prediction in Health Care Heterogeneous Data Structures of Electronic Health Record

Authors: Raghavi C. Janaswamy

Abstract:

In the healthcare sector, heterogeneous data elements such as patients, diagnoses, symptoms, conditions, observation text from physician notes, and prescriptions form the essentials of the Electronic Health Record (EHR). The data, in the form of clear text and images, are stored or processed in a relational format in most systems. However, the intrinsic structural restrictions and complex joins of relational databases limit their widespread utility. In this regard, the design and development of realistic mappings and deep connections as real-time objects offer unparalleled advantages. Herein, a graph neural network-based classification of EHR data has been developed. Patient conditions are predicted as a node classification task using graph-based open-source EHR data, the Synthea database, stored in TigerGraph. The Synthea dataset is used because it is voluminous and closely represents real-time data. The graph model is built from the heterogeneous EHR data using Python modules: pyTigerGraph to get nodes and edges from the TigerGraph database, PyTorch to tensorize the nodes and edges, and PyTorch-Geometric (PyG) to train the Graph Neural Network (GNN), adopting self-supervised learning techniques with autoencoders to generate the node embeddings and eventually performing node classification using those embeddings. The model predicts patient conditions ranging from common to rare situations. The outcome is expected to open up opportunities for data querying toward better predictions and accuracy.

Keywords: electronic health record, graph neural network, heterogeneous data, prediction

Procedia PDF Downloads 63
1843 Improved Color-Based K-Mean Algorithm for Clustering of Satellite Image

Authors: Sangeeta Yadav, Mantosh Biswas

Abstract:

In this paper, we propose an improved color-based K-mean algorithm for clustering of satellite images (SAR). Our method comprises two stages. The first stage is an interactive selection process in which users input the number of colors (ncolor) and the number of clusters, and are then prompted to select points in each color cluster. In the second stage, these points are given as input to the K-mean clustering algorithm, which clusters the image based on color and minimum squared Euclidean distance. The proposed method reduces the mixed-pixel problem to a great extent.
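The two-stage idea can be illustrated with a minimal sketch: plain Lloyd-style k-means on RGB triples, initialised from seed colours rather than random centres. This is not the authors' implementation. The function name `seeded_kmeans`, the toy pixel values, and the seed colours are all hypothetical stand-ins for the interactively selected points.

```python
def seeded_kmeans(pixels, seeds, n_iter=20):
    """Lloyd-style k-means on colour tuples, started from user-picked seed colours."""
    centers = [tuple(float(v) for v in s) for s in seeds]
    labels = []
    for _ in range(n_iter):
        # Assign each pixel to the centre with the smallest squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(px, centers[k])))
            for px in pixels
        ]
        # Move each centre to the mean colour of the pixels assigned to it.
        new_centers = []
        for k, old in enumerate(centers):
            members = [px for px, lab in zip(pixels, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(old)  # keep the seed if no pixel chose it
        if new_centers == centers:  # converged: centres stopped moving
            break
        centers = new_centers
    return labels, centers


# Toy "image": three reddish and three bluish pixels (hypothetical values).
pixels = [(250, 10, 10), (240, 20, 15), (245, 5, 20),
          (10, 10, 250), (20, 15, 240), (5, 20, 245)]
seeds = [(255, 0, 0), (0, 0, 255)]  # stand-ins for the user-selected points
labels, centers = seeded_kmeans(pixels, seeds)
```

Seeding from selected points fixes the number of clusters and their starting colours up front, so the result does not depend on a random initialisation.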

Keywords: cluster, ncolor method, K-mean method, interactive selection process

Procedia PDF Downloads 263
1842 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks

Authors: Jiajun Wang, Xiaoge Li

Abstract:

The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence of words in different ranges in the sentence, resulting in deviation when assigning relationship weights to words other than aspect words. To solve these problems, we propose a new aspect-level sentiment analysis model that combines a multi-channel convolutional network and a graph convolutional network (GCN). First, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and a self-attention mechanism. In addition, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a graph convolutional network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models and show that our model is more effective.

Keywords: aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree

Procedia PDF Downloads 176
1841 Surface to the Deeper: A Universal Entity Alignment Approach Focusing on Surface Information

Authors: Zheng Baichuan, Li Shenghui, Li Bingqian, Zhang Ning, Chen Kai

Abstract:

Entity alignment (EA) tasks often play a pivotal role in the integration of knowledge graphs, where structural differences frequently exist between the source and target graphs, such as the presence or absence of attribute information and the types of attribute information (text, timestamps, images, etc.). However, most current research efforts are focused on improving alignment accuracy, often along with an increased reliance on specific structures, a dependency that inevitably diminishes their practical value and causes difficulties when facing knowledge graph alignment tasks with varying structures. Therefore, we propose a universal knowledge graph alignment approach that only utilizes the common basic structures shared by knowledge graphs. We have demonstrated through experiments that our method achieves state-of-the-art performance in fair comparisons.

Keywords: knowledge graph, entity alignment, transformer, deep learning

Procedia PDF Downloads 17
1840 Robust Diagnosability of PEMFC Based on Bond Graph LFT

Authors: Ould Bouamama, M. Bressel, D. Hissel, M. Hilairet

Abstract:

The fuel cell (FC) is one of the best alternatives to fossil energy. Recently, the fuel cell research community has shown considerable interest in diagnosis, with a view to ensuring safety, security, and availability when faults occur in the process. The difficulty in model-based FC diagnosis is that the model is complex because of the coupling of several kinds of energy, and the numerical values of parameters are not always known or are uncertain. The present paper deals with the use of one tool, the Linear Fractional Transformation bond graph, not only for uncertain modelling but also for monitorability analysis (the ability to detect and isolate faults) and formal generation of fault indicators that are robust with respect to parameter uncertainties. The developed theory, applied to a nonlinear FC system, has proved its efficiency.

Keywords: bond graph, fuel cell, fault detection and isolation (FDI), robust diagnosis, structural analysis

Procedia PDF Downloads 341
1839 Issue Reorganization Using the Measure of Relevance

Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim

Abstract:

Recently, the demand for extracting R&D keywords from issues and using them to retrieve R&D information has been increasing rapidly. However, it is hard to identify related issues or to distinguish between them. Although the similarity between issues cannot be identified directly, the issues that consistently share the same R&D keywords can be determined with an R&D lexicon. In detail, the R&D keywords associated with a particular issue indicate the key technology elements needed to solve the problem that the issue poses. Furthermore, related issues that share the same R&D keywords can be shown more systematically through an issue clustering constructed from the perspective of R&D. Thus, sharing of R&D results and reuse of R&D technology can be facilitated. Indirectly, redundant investment in the same R&D can be reduced, as R&D information can be shared between the corresponding issues and the reusability of related R&D can be improved. Therefore, a methodology for constructing an issue clustering from the perspective of common R&D keywords is proposed to satisfy these demands.

Keywords: clustering, social network analysis, text mining, topic analysis

Procedia PDF Downloads 551
1838 A Hybrid Method for Determination of Effective Poles Using Clustering Dominant Pole Algorithm

Authors: Anuj Abraham, N. Pappa, Daniel Honc, Rahul Sharma

Abstract:

In this paper, an analysis of some model order reduction techniques is presented. A new hybrid algorithm for model order reduction of linear time-invariant systems is compared with the conventional techniques, namely Balanced Truncation, Hankel Norm reduction, and the Dominant Pole Algorithm (DPA). The proposed hybrid algorithm, known as the Clustering Dominant Pole Algorithm (CDPA), is able to compute the full set of dominant poles and its cluster center efficiently. The dominant poles of a transfer function are specific eigenvalues of the state space matrix of the corresponding dynamical system. The effectiveness of this novel technique is shown through the simulation results.
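The link between poles and eigenvalues can be made concrete with a small sketch. It is not the CDPA algorithm, and the abstract does not state which dominance criterion is used; the ranking here is one common measure, the residue magnitude divided by the distance of the pole from the imaginary axis. The sketch assumes a diagonalizable state matrix with no poles on the imaginary axis. The helper name `ranked_poles` and the toy system are hypothetical.

```python
import numpy as np


def ranked_poles(A, b, c):
    """Poles and residues of H(s) = c^T (sI - A)^{-1} b, most dominant first.

    Dominance score (an assumed, commonly used measure): |R_i| / |Re(lambda_i)|.
    """
    lam, X = np.linalg.eig(A)      # columns of X are right eigenvectors
    Y = np.linalg.inv(X)           # rows of Y are left eigenvectors, with Y @ X = I
    residues = (c @ X) * (Y @ b)   # R_i = (c^T x_i)(y_i^T b)
    score = np.abs(residues) / np.abs(lam.real)
    order = np.argsort(-score)     # descending score
    return lam[order], residues[order], score[order]


# Toy third-order system (hypothetical values): poles at -1, -10 and -100.
A = np.diag([-1.0, -10.0, -100.0])
b = np.ones(3)
c = np.ones(3)
poles, residues, score = ranked_poles(A, b, c)
```

With equal residues, the pole closest to the imaginary axis ranks first, which matches the intuition that slow modes shape the response most. A reduced-order model would keep the top-ranked poles and drop the rest.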

Keywords: balanced truncation, clustering, dominant pole, Hankel norm, model reduction

Procedia PDF Downloads 577
1837 A Neural Network Based Clustering Approach for Imputing Multivariate Values in Big Data

Authors: S. Nickolas, Shobha K.

Abstract:

The treatment of incomplete data is an important step in data pre-processing. Missing values create a noisy environment in all applications and are an unavoidable problem in big data management and analysis. Numerous techniques, such as discarding rows with missing values, mean imputation, expectation maximization, neural networks with evolutionary algorithms or optimized techniques, and hot deck imputation, have been introduced by researchers for handling missing data. Among these, imputation techniques play a positive role in filling missing values when it is necessary to use all records in the data rather than discard records with missing values. In this paper, we propose a novel artificial neural network-based clustering algorithm, Adaptive Resonance Theory-2 (ART2), for imputation of missing values in mixed-attribute data sets. ART2 can recognize learned models quickly and adapt to new objects rapidly. It carries out model-based clustering by using competitive learning and a self-steady mechanism in a dynamic environment without supervision. The proposed approach not only imputes the missing values but also provides information about handling outliers.

Keywords: ART2, data imputation, clustering, missing data, neural network, pre-processing

Procedia PDF Downloads 249
1836 Structure Clustering for Milestoning Applications of Complex Conformational Transitions

Authors: Amani Tahat, Serdal Kirmizialtin

Abstract:

Trajectory fragment methods such as Markov State Models (MSM), Milestoning (MS), and Transition Path Sampling are the prime choice for extending the timescale of all-atom Molecular Dynamics simulations. In these approaches, a set of structures that covers the accessible phase space has to be chosen a priori using cluster analysis. Structural clustering serves to partition the conformational space into natural subgroups based on their similarity, an essential statistical methodology for analyzing the numerous sets of empirical data produced by Molecular Dynamics (MD) simulations. Local transition kernels among these clusters are later used to connect the metastable states using a Markovian kinetic model in MSM and a non-Markovian model in MS. The choice of clustering approach in constructing such a kernel is crucial, since the high dimensionality of biomolecular structures might easily confuse the identification of clusters when using the traditional hierarchical clustering methodology. Of particular interest, in the case of MS, where the milestones are very close to each other, accurate determination of the milestone identity of the trajectory becomes a challenging issue. In this work, we present two cluster analysis methods applied to the cis–trans isomerism of dinucleotide AA. The reason for choosing nucleic acids over the commonly used proteins to study cluster analysis is twofold: i) the energy landscape is rugged, so transitions are more complex, enabling a more realistic model for studying conformational transitions; ii) the conformational space of nucleic acids is high dimensional. A diverse set of internal coordinates is necessary to describe the metastable states in nucleic acids, posing a challenge in studying the conformational transitions. Herein, we need improved clustering methods that accurately and robustly identify the AA structure in its metastable states for a wide range of confused data conditions.
The single linkage approach of the hierarchical clustering available in the GROMACS MD package is the first clustering methodology applied to our data. The Self-Organizing Map (SOM) neural network, also known as a Kohonen network, is the second data clustering methodology. The performance of the neural network and the hierarchical clustering method is compared by computing the mean first passage times for the cis-trans conformational rates. Our hope is that this study provides insight into the complexities of, and the need for, determining the appropriate clustering algorithm for kinetic analysis. Our results can improve the effectiveness of decisions based on clustering confused empirical data in studying conformational transitions in biomolecules.
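As a rough illustration of the single linkage idea mentioned above, the sketch below merges points transitively whenever any pair lies within a cutoff. It is a generic toy version on 2-D points, not the GROMACS implementation, which operates on distances between structures. The function name `single_linkage`, the points, and the cutoff are hypothetical.

```python
from itertools import combinations
from math import dist


def single_linkage(points, cutoff):
    """Label points so that any two within `cutoff` share a cluster, transitively."""
    parent = list(range(len(points)))  # union-find forest, one tree per cluster

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    # Merge the clusters of every pair that is close enough.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= cutoff:
            parent[find(i)] = find(j)

    # Relabel cluster roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(len(points))]


# Toy data (hypothetical): a chain of three nearby points plus a separate pair.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
cluster_ids = single_linkage(points, cutoff=0.6)
```

The first and third points are farther apart than the cutoff but still end up together through the middle point. This chaining behaviour is one reason single linkage can blur closely spaced states.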

Keywords: milestoning, self organizing map, single linkage, structure clustering

Procedia PDF Downloads 198