Search results for: time series clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 19721

19601 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low cost. Many scientists around the world take advantage of gene profiling to characterize complex biological circumstances and diseases. Microarray techniques used in genome-wide gene expression and genome mutation analysis help scientists and physicians in understanding pathophysiological mechanisms, in diagnoses and prognoses, and in choosing treatment plans. DNA microarray technology has made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deal with gene expression data effectively. Existing algorithms such as the Support Vector Machine (SVM), K-means, and evolutionary algorithms are analyzed thoroughly to identify their advantages and limitations. A performance evaluation of the existing algorithms is carried out to determine the best approach. In order to improve the classification performance of the best approach in terms of accuracy, convergence behavior, and processing time, a hybrid clustering-based optimization approach is proposed.

Keywords: microarray technology, gene expression data, clustering, gene selection

Procedia PDF Downloads 295
19600 Human Digital Twin for Personal Conversation Automation Using Supervised Machine Learning Approaches

Authors: Aya Salama

Abstract:

Digital Twin is an emerging research topic that has attracted researchers in the last decade. It is used in many fields, such as smart manufacturing and smart healthcare, because it saves time and money. It is usually related to other technologies such as Data Mining, Artificial Intelligence, and Machine Learning. However, the Human Digital Twin (HDT) specifically is still a novel idea whose feasibility remains to be proven. HDT extends the idea of the Digital Twin to human beings, who are living beings and differ from inanimate physical entities. The goal of this research was to create a Human Digital Twin that is responsible for automating real-time human replies by simulating human behavior. For this reason, clustering, supervised classification, topic extraction, and sentiment analysis were studied in this paper. The feasibility of the HDT for generating personal replies on social messaging applications was demonstrated in this work. The overall accuracy of the proposed approach was 63%, which is a very promising result that can open the way for researchers to expand the idea of HDT. This was achieved by using Random Forest for clustering the question database and matching new questions. K-nearest neighbor was also applied for sentiment analysis.

Keywords: human digital twin, sentiment analysis, topic extraction, supervised machine learning, unsupervised machine learning, classification, clustering

Procedia PDF Downloads 65
19599 Data Mining Spatial: Unsupervised Classification of Geographic Data

Authors: Chahrazed Zouaoui

Abstract:

In recent years, the volume of geospatial information has been increasing due to the evolution of information and communication technologies. This information is often presented by geographic information systems (GIS) and stored in spatial databases (BDS). Classical data mining has revealed a weakness in extracting knowledge from these enormous amounts of data, due to the particularity of spatial entities, which are characterized by their interdependence (the first law of geography). This gave rise to spatial data mining. Spatial data mining is a process of analyzing geographic data that allows the extraction of knowledge and spatial relationships from geospatial data. Among the methods of this process, we distinguish the monothematic and the thematic. Geo-clustering is one of the main tasks of spatial data mining and belongs to the monothematic methods. It groups similar geo-spatial entities into the same class and assigns dissimilar entities to different classes; in other words, it maximizes intra-class similarity and minimizes inter-class similarity while taking account of the particularity of geo-spatial data. Two approaches to geo-clustering exist: dynamic processing of data, which applies algorithms designed for the direct treatment of spatial data, and the approach based on spatial data pre-processing, which applies classic clustering algorithms to pre-processed data (by integration of spatial relationships).
The pre-processing approach is quite complex in many cases, so the search for approximate solutions involves the use of approximation algorithms. Among these, we are interested in dedicated approaches (partitioning and density-based clustering methods) and the bees approach (a biomimetic approach). Our study proposes a design for this problem that uses different algorithms to automatically detect the geo-spatial neighborhood in order to implement geo-clustering by pre-processing, and applies the bees algorithm to this problem for the first time in the geo-spatial field.

Keywords: mining, GIS, geo-clustering, neighborhood

Procedia PDF Downloads 359
19598 A Review of Different Studies on Hidden Markov Models for Multi-Temporal Satellite Images: Stationarity and Non-Stationarity Issues

Authors: Ali Ben Abbes, Imed Riadh Farah

Abstract:

Due to considerable advances in Multi-Temporal Satellite Images (MTSI), remote sensing applications have become more accurate. Recently, many advances in modeling MTSI have been developed using various models. The purpose of this article is to present an overview of studies using the Hidden Markov Model (HMM). First of all, we provide background on HMMs and their applications in this context. A comparison of the different works is discussed, and possible areas and challenges are highlighted. Secondly, we discuss the differences between vegetation monitoring and urban growth applications. Most research efforts, however, have used only stationary data. From another point of view, in this paper, we describe a new non-stationary HMM that is defined over a set of components of the time series, e.g., seasonal, trend, and random. In addition, the new approach gives more accurate results and improves the applicability of the HMM in modeling non-stationary data series. In order to assess the performance of the HMM, different experiments are carried out using Moderate Resolution Imaging Spectroradiometer (MODIS) NDVI time series of the northwestern region of Tunisia and Landsat time series of Tres Cantos-Madrid in Spain.

Keywords: multi-temporal satellite image, HMM, nonstationarity, vegetation, urban

Procedia PDF Downloads 326
19597 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific, and anatomical components, have been used and fed into the classifier algorithms. The dataset used in this study has been taken from the University of California, Irvine (UCI) machine learning repository. Feature weighting methods, including fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compared with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k-nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method combining subtractive clustering based feature weighting with the decision tree classifier obtained a classification accuracy of 100% in the screening of DR. These results demonstrate that the proposed hybrid scheme is very promising for medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 305
19596 Clustering Color Space, Time Interest Points for Moving Objects

Authors: Insaf Bellamine, Hamid Tairi

Abstract:

Detecting moving objects in sequences is an essential step for video analysis. This paper mainly contributes to the extraction and detection of Color Space-Time Interest Points (CSTIP). We propose a new method for the detection of moving objects, composed of two main steps. First, we suggest applying the CSTIP detection algorithm to both components of the Color Structure-Texture Image Decomposition, which is based on a Partial Differential Equation (PDE): a color geometric structure component and a color texture component. A descriptor is associated with each of these points. In a second stage, we address the problem of grouping the CSTIP points into clusters. Experiments and comparisons with other motion detection methods on challenging sequences show the performance of the proposed method and its utility for video analysis. Experimental results are obtained from very different types of videos, namely sport videos and animation movies.

Keywords: Color Space-Time Interest Points (CSTIP), Color Structure-Texture Image Decomposition, Motion Detection, clustering

Procedia PDF Downloads 355
19595 R Software for Parameter Estimation of Spatio-Temporal Model

Authors: Budi Nurani Ruchjana, Atje Setiawan Abdullah, I. Gede Nyoman Mindra Jaya, Eddy Hermawan

Abstract:

In this paper, we propose an application package to estimate the parameters of a spatiotemporal model based on multivariate time series analysis using the open-source R software. We build the package mainly to estimate the parameters of the Generalized Space Time Autoregressive (GSTAR) model. GSTAR is a combination of time series and spatial models whose parameters vary per location. We use the Ordinary Least Squares (OLS) method for estimation and the Mean Absolute Percentage Error (MAPE) to assess how well the model fits real spatiotemporal phenomena. As case studies, we use oil production data from the volcanic layer at Jatibarang, Indonesia, or climate data such as rainfall in Indonesia. R is very user-friendly and makes calculation easier, and data processing is accurate and fast. The R script built for estimating the GSTAR model parameters is still limited to stationary time series models. Therefore, the R program under Windows can be developed further for both theoretical studies and applications.
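The fit criterion named in this abstract, MAPE, is conventionally the mean absolute percentage error. As a minimal sketch only (the abstract's package is written in R and is not reproduced here; the function name and sample numbers below are hypothetical), the metric can be computed as follows:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # A zero observation makes the percentage error undefined.
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observations and model output, for illustration only.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(round(mape(observed, forecast), 2))
```

A lower value indicates a closer fit; the metric is undefined when any observation is zero, which is why the sketch rejects such input.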

Keywords: GSTAR Model, MAPE, OLS method, oil production, R software

Procedia PDF Downloads 217
19594 Clustering-Based Computational Workload Minimization in Ontology Matching

Authors: Mansir Abubakar, Hazlina Hamdan, Norwati Mustapha, Teh Noranis Mohd Aris

Abstract:

In order to build a matching pattern for each class correspondence of an ontology, it is necessary to specify a set of attribute correspondences across two corresponding classes by clustering. Clustering reduces the number of potential attribute correspondences considered in the matching activity, which significantly reduces the computational workload; otherwise, all attributes of a class would have to be compared with all attributes of the corresponding class. Most existing ontology matching approaches lack scalable attribute discovery methods, such as cluster-based attribute searching. This problem makes ontology matching computationally expensive. It is therefore vital in ontology matching to design a scalable element or attribute correspondence discovery method that reduces the number of potential element correspondences during mapping and thereby reduces the computational workload of the matching process as a whole. The objectives of this work are 1) to design a clustering method for discovering similar attribute correspondences and relationships between ontologies, and 2) to discover element correspondences by classifying the elements of each class based on the features of their values using the K-medoids clustering technique. Discovering attribute correspondences is essential for comparing instances when matching two ontologies. During the matching process, any two instances across two different data sets should be compared on their attribute values, so that they can be regarded as the same or not. Intuitively, any two instances that come from classes across which there is a class correspondence are likely to be identical to each other. Besides, any two instances that hold more similar attribute values are more likely to be matched than ones with less similar attribute values. Most of the time, similar attribute values exist in two instances across which there is an attribute correspondence.
This work presents how to classify the attributes of each class with K-medoids clustering, after which the clustered groups are mapped by their statistical value features. We also show how to map the attributes of a clustered group to the attributes of the mapped clustered group, generating a set of potential attribute correspondences that is then used to generate a matching pattern. The K-medoids clustering phase largely reduces the number of non-corresponding attribute pairs considered when comparing instances, as only attribute pairs whose coverage probability reaches 100% and attributes above the specified threshold are considered as potential attributes for a matching. Using clustering reduces the number of potential element correspondences to be considered during the mapping activity, which in turn reduces the computational workload significantly. Otherwise, all elements of a class in the source ontology have to be compared with all elements of the corresponding classes in the target ontology. K-medoids can ably cluster the attributes of each class, so that a proportion of the attribute pairs that are not corresponding are not considered when constructing the matching pattern.
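To make the K-medoids step concrete, the following is a minimal sketch of a generic alternating K-medoids procedure. It is not the authors' implementation: the function name, the naive initialisation, and the one-dimensional "statistical value features" used as input are all assumptions for illustration.

```python
def k_medoids(items, k, dist, max_iter=100):
    """Alternate between assigning items to the nearest medoid and
    re-picking each cluster's medoid until the medoids stop changing."""
    medoids = list(items[:k])  # naive deterministic initialisation (assumption)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each item joins the cluster of its closest medoid.
        clusters = [[] for _ in range(k)]
        for x in items:
            nearest = min(range(k), key=lambda j: dist(x, medoids[j]))
            clusters[nearest].append(x)
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = [
            min(c, key=lambda cand: sum(dist(cand, y) for y in c)) if c else medoids[j]
            for j, c in enumerate(clusters)
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, clusters


# Hypothetical per-attribute summary values (e.g. mean value lengths).
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
medoids, clusters = k_medoids(values, 2, lambda a, b: abs(a - b))
print(medoids, clusters)
```

Once attributes are grouped this way, only attributes in mapped groups need to be compared pairwise, which is the workload reduction the abstract describes.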

Keywords: attribute correspondence, clustering, computational workload, k-medoids clustering, ontology matching

Procedia PDF Downloads 224
19593 Time Series Modelling and Prediction of River Runoff: Case Study of Karkheh River, Iran

Authors: Karim Hamidi Machekposhti, Hossein Sedghi, Abdolrasoul Telvari, Hossein Babazadeh

Abstract:

Rainfall and runoff are chaotic and complex natural phenomena that require sophisticated modelling and simulation methods for explanation and use. Time series modelling allows runoff data analysis and can be used as a forecasting tool. In this paper, an attempt is made to model river runoff data and predict the future behavioural pattern of the river based on past observations of annual river runoff. The river runoff analysis and prediction are done using an ARIMA model. To evaluate the efficiency of prediction for hydrological events such as rainfall and runoff, we use the applicable statistical formulae. The good agreement between predicted and observed river runoff, as measured by the coefficient of determination (R2), shows that ARIMA(4,1,1) is a suitable model for predicting Karkheh River runoff in Iran.
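Two ingredients of this abstract lend themselves to a short illustration: the middle term of ARIMA(4,1,1) means the series is differenced once before the autoregressive and moving-average parts are fitted, and the coefficient of determination measures agreement between observed and predicted values. The sketch below shows only these two pieces; it does not fit an ARIMA model, and the function names and numbers are hypothetical.

```python
def difference(series):
    """First-order differencing, the d = 1 step of an ARIMA(p, 1, q) model."""
    return [b - a for a, b in zip(series, series[1:])]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical annual runoff values and predictions, for illustration only.
runoff = [10.0, 12.0, 15.0, 19.0]
predicted = [11.0, 12.0, 14.0, 19.0]
print(difference(runoff))
print(round(r_squared(runoff, predicted), 4))
```

An R2 close to 1 indicates that the predictions explain most of the variance in the observations.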

Keywords: time series modelling, ARIMA model, river runoff, Karkheh River, CLS method

Procedia PDF Downloads 314
19592 Clustering the Wheat Seeds Using SOM Artificial Neural Networks

Authors: Salah Ghamari

Abstract:

In this study, the ability of self-organizing map (SOM) artificial neural networks to cluster wheat seed varieties according to their morphological properties was considered. The SOM is a type of unsupervised competitive learning. Experimentally, five morphological features of 300 seeds (including three varieties: gaskozhen, Md and sardari) were obtained using an image processing technique. The results show that the artificial neural network performs well (90.33% accuracy) in classifying the wheat varieties despite the high similarity among them. The highest classification accuracy (100%) was achieved for sardari.

Keywords: artificial neural networks, clustering, self organizing map, wheat variety

Procedia PDF Downloads 618
19591 GCM Based Fuzzy Clustering to Identify Homogeneous Climatic Regions of North-East India

Authors: Arup K. Sarma, Jayshree Hazarika

Abstract:

The North-eastern part of India, which receives heavier rainfall than other parts of the subcontinent, is of great concern nowadays with regard to climate change. High-intensity rainfall of short duration and longer dry spells, occurring due to the impact of climate change, affect river morphology too. In the present study, an attempt is made to delineate the North-Eastern region of India into homogeneous clusters based on the Fuzzy Clustering concept and to compare the clusters obtained by using conventional and non-conventional methods of clustering. The concept of clustering is adopted in view of the fact that the impact of climate change can be studied in a homogeneous region without much variation, which can be helpful in studies related to water resources planning and management. Ten IMD (Indian Meteorological Department) stations, situated in various regions of the North-east, have been selected for making the clusters. The results of the Fuzzy C-Means (FCM) analysis show different clustering patterns for different conditions. From the analysis and comparison, it can be concluded that the non-conventional method of using GCM data gives somewhat better results than the others. However, further analysis can be done by taking daily data instead of monthly means to reduce the effect of standardization.
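For readers unfamiliar with Fuzzy C-Means, the sketch below shows the standard FCM membership update for a single point given fixed cluster centers, with fuzzifier m. It is a generic illustration, not the study's code; the function name, the default m = 2, and the coordinates are assumptions.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster:
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Hypothetical standardized station features and two cluster centers.
station = (1.0, 0.0)
centers = [(0.0, 0.0), (4.0, 0.0)]
print([round(u, 3) for u in fcm_memberships(station, centers)])
```

Unlike hard clustering, each station receives a degree of membership in every cluster, and the degrees sum to one.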

Keywords: climate change, conventional and nonconventional methods of clustering, FCM analysis, homogeneous regions

Procedia PDF Downloads 359
19590 The Clustering of Multiple Sclerosis Subgroups through L2 Norm Multifractal Denoising Technique

Authors: Yeliz Karaca, Rana Karabudak

Abstract:

Multifractal denoising techniques are used to identify significant attributes by removing noise from a dataset. Magnetic resonance (MR) imaging is the most sensitive method for identifying chronic disorders of the nervous system such as Multiple Sclerosis (MS). MRI and Expanded Disability Status Scale (EDSS) data belonging to 120 individuals who have one of the subgroups of MS (Relapsing Remitting MS (RRMS), Secondary Progressive MS (SPMS), Primary Progressive MS (PPMS)), as well as 19 healthy individuals in the control group, have been used in this study. The study comprises the following stages: (i) The L2 Norm Multifractal Denoising technique, one of the multifractal techniques, has been applied to the MS data (MRI and EDSS), yielding a new dataset. (ii) The new MS dataset, obtained by applying the L2 Norm Multifractal Denoising technique to the MS dataset, has been fed to the K-Means and Fuzzy C-Means (FCM) clustering algorithms, which are unsupervised methods, and the clustering performances have been compared. (iii) In identifying the significant attributes in the MS dataset through the L2 Norm Multifractal Denoising technique, the K-Means and FCM algorithms yielded excellent performance on the MS subgroups and the control group of healthy individuals. According to the clustering results based on the MS subgroups, successful clustering has been obtained with the K-Means and FCM algorithms after applying the L2 Norm Multifractal Denoising technique to the MS dataset. Clustering with K-Means and FCM has been more successful on the L2_Norm MS Data Set, in which significant attributes are obtained by applying the L2 Norm Denoising technique.

Keywords: clinical decision support, clustering algorithms, multiple sclerosis, multifractal techniques

Procedia PDF Downloads 141
19589 Enhancing Patch Time Series Transformer with Wavelet Transform for Improved Stock Prediction

Authors: Cheng-yu Hsieh, Bo Zhang, Ahmed Hambaba

Abstract:

Stock market prediction has long been an area of interest for both expert analysts and investors, driven by its complexity and the noisy, volatile conditions it operates under. This research examines the efficacy of combining the Patch Time Series Transformer (PatchTST) with wavelet transforms, specifically focusing on Haar and Daubechies wavelets, in forecasting the adjusted closing price of the S&P 500 index for the following day. By comparing the performance of the augmented PatchTST models with traditional predictive models such as Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformers, this study highlights significant enhancements in prediction accuracy. The integration of the Daubechies wavelet with PatchTST notably excels, surpassing other configurations and conventional models in terms of Mean Absolute Error (MAE) and Mean Squared Error (MSE). The success of the PatchTST model paired with Daubechies wavelet is attributed to its superior capability in extracting detailed signal information and eliminating irrelevant noise, thus proving to be an effective approach for financial time series forecasting.
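As a rough illustration of the simpler of the two wavelets mentioned, the sketch below implements one level of the orthonormal Haar transform and its inverse in plain Python. It is not the paper's pipeline (which pairs wavelets with PatchTST); the function names and the price values are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: pairwise scaled
    sums (approximation) and differences (detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original samples from one level of Haar coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


# Hypothetical closing prices, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0]
approx, detail = haar_step(prices)
print([round(x, 6) for x in haar_inverse(approx, detail)])
```

The approximation coefficients carry the smoothed trend, while the detail coefficients carry the high-frequency fluctuations that a denoising step would typically shrink or discard before reconstruction.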

Keywords: deep learning, financial forecasting, stock market prediction, patch time series transformer, wavelet transform

Procedia PDF Downloads 2
19588 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications

Authors: K. P. Sandesh, M. H. Suman

Abstract:

Text processing plays an important role in information retrieval, data mining, and web search. Measuring the similarity between documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature, the proposed measure takes the following three cases into account: (1) the feature appears in both documents; (2) the feature appears in only one document; and (3) the feature appears in neither document. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in the banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.

Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms

Procedia PDF Downloads 489
19587 Series "H154M" as a Unit Area of the Region between the Lines and Curves

Authors: Hisyam Hidayatullah

Abstract:

Whether we are conscious of it or not, events in this world follow patterns, up to the events of the universe, such as, according to the Big Bang theory, the solar system with its very regular rotation. The author determines the area of the region between the quadratic function $y=kx^2$ and the line $y=ka^2$ using the GeoGebra application, version 4.2. This paper provides a series that is no less interesting than the Fourier series and adds new material on series that can be calculated with sigma notation. In addition, the areas formed change in a distinctive way across the natural numbers. Finally, this paper provides an analytic and geometric proof that the area between the line and the curve, bounded by the line $y=ka^2$, the curve $y=kx^2$, the x-axis, and the lines $x=\sqrt{a}$ and $x=-\sqrt{a}$, generates a series of numbers for $k=1$ and $a \in \mathbb{N}$ (the natural numbers): $\sum_{i=0}^{n} \frac{4i\sqrt{i}}{3} = 0 + \frac{4}{3} + \frac{8\sqrt{2}}{3} + 4\sqrt{3} + \cdots + \frac{4n\sqrt{n}}{3}$. The author calls the series “H154M”.
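The listed values 0, 4/3, 8√2/3, 4√3 are consistent with a general term of 4i√i/3. The sketch below computes that term and its partial sums, and compares one term against a numerical area. The area check rests on an assumption that is not stated in the abstract: for k = 1, the bounding horizontal line is taken at height a, which is the reading under which the line meets the parabola at x = ±√a. All function names are hypothetical.

```python
import math


def h154m_term(i):
    """General term 4 * i * sqrt(i) / 3 implied by the listed values."""
    return 4.0 * i * math.sqrt(i) / 3.0


def h154m_partial_sum(n):
    """Sum of the terms for i = 0 .. n."""
    return sum(h154m_term(i) for i in range(n + 1))


def area_between(a, steps=10000):
    """Midpoint-rule area between y = a and y = x**2 on [-sqrt(a), sqrt(a)].
    Assumes k = 1 and a horizontal line at height a (see lead-in)."""
    left = -math.sqrt(a)
    width = 2.0 * math.sqrt(a) / steps
    return sum((a - (left + (j + 0.5) * width) ** 2) * width for j in range(steps))


print(round(h154m_term(3), 6), round(4.0 * math.sqrt(3.0), 6))
print(round(area_between(2), 6), round(h154m_term(2), 6))
print(round(h154m_partial_sum(3), 6))
```

Under the stated assumption, the numerical area and the closed-form term agree to within the integration error.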

Keywords: sequence, series, sigma notation, application GeoGebra

Procedia PDF Downloads 347
19586 Application of Fuzzy Clustering on Classification Agile Supply Chain

Authors: Hamidreza Fallah Lajimi, Elham Karami, Fatemeh Ali nasab, Mostafa Mahdavikia

Abstract:

Being responsive is an increasingly important skill for firms in today’s global economy; thus, firms must be agile. Naturally, it follows that an organization’s agility depends on its supply chain being agile. However, achieving supply chain agility is a function of other abilities within the organization. This paper analyses the results of a survey of 71 Iranian manufacturing companies in order to identify some of the factors that make organizations agile in managing their supply chains. We then classify these companies into four clusters with the fuzzy c-means technique and use four validation functions to automatically determine the optimal number of clusters.

Keywords: agile supply chain, clustering, fuzzy clustering

Procedia PDF Downloads 430
19585 Sustainability and Clustering: A Bibliometric Assessment

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner, David Gabriel F. Barros

Abstract:

Review studies are useful for analyzing research problems. Among the types of review documents, bibliometric studies are common. This type of study often helps with the global visualization of a research problem and helps academics worldwide to better understand the context of a research area. In this document, a bibliometric view of clustering techniques and sustainability problems is presented. The authors aimed to identify which issues most often use clustering techniques and which sustainability issue is most impactful at the current moment of research. During the bibliometric analysis, we found ten different groups of research on clustering applications for sustainability issues: Energy; Environmental; Non-urban Planning; Sustainable Development; Sustainable Supply Chain; Transport; Urban Planning; Water; Waste Disposal; and Others. By analyzing the citations of each group, we discovered that the Environmental group could be classified as the most impactful research cluster in the area. After a content analysis of each paper classified in the Environmental group, we found that the k-means technique is preferred for solving sustainability problems with clustering methods, since it appeared the most among the documents. The authors finally conclude that a bibliometric assessment can help indicate a research gap on waste disposal, which was the group with the fewest publications, and the most impactful research on environmental problems.

Keywords: bibliometric assessment, clustering, sustainability, territorial partitioning

Procedia PDF Downloads 84
19584 Multi-Cluster Overlapping K-Means Extension Algorithm (MCOKE)

Authors: Said Baadel, Fadi Thabtah, Joan Lu

Abstract:

Clustering involves the partitioning of n objects into k clusters. Many clustering algorithms use hard-partitioning techniques where each object is assigned to one cluster. In this paper, we propose an overlapping algorithm, MCOKE, which allows objects to belong to one or more clusters. The algorithm differs from fuzzy clustering techniques because objects that overlap are assigned a membership value of 1 (one) as opposed to a fuzzy membership degree. The algorithm also differs from other overlapping algorithms that require a similarity threshold to be defined a priori, which can be difficult for novice users to determine.
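The abstract does not spell out how overlap is decided. One way such a rule could work, sketched below purely as an assumption, is to run k-means first and then derive the threshold from the data: the largest distance between any object and its nearest centroid. Every object then gets membership 1 in each cluster whose centroid lies within that distance. The function name, the centroids, and the points are hypothetical.

```python
import math


def overlapping_memberships(points, centroids):
    """Binary (0/1) membership matrix allowing an object to sit in
    several clusters. Threshold rule is an assumption (see lead-in)."""
    dists = [[math.dist(p, c) for c in centroids] for p in points]
    # Data-derived threshold: farthest any object lies from its own
    # (nearest) centroid, so no user-supplied threshold is needed.
    max_dist = max(min(row) for row in dists)
    return [[1 if d <= max_dist else 0 for d in row] for row in dists]


# Hypothetical 1-D objects and centroids from a prior k-means run.
points = [(0.0,), (1.0,), (5.0,), (9.0,), (10.0,)]
centroids = [(0.5,), (9.5,)]
for p, row in zip(points, overlapping_memberships(points, centroids)):
    print(p, row)
```

In this toy input the middle object is equally far from both centroids and therefore receives membership 1 in both clusters, while the remaining objects stay in a single cluster.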

Keywords: data mining, k-means, MCOKE, overlapping

Procedia PDF Downloads 540
19583 Design an Algorithm for Software Development in CBSE Environment Using Feed Forward Neural Network

Authors: Amit Verma, Pardeep Kaur

Abstract:

In software development organizations, component-based software engineering (CBSE) is an emerging paradigm that has gained wide acceptance, as it often results in increased software quality within development time and budget. In component reuse, the main challenge is identifying the right component from large repositories at the right time. The major objective of this work is to provide an efficient algorithm for the storage and effective retrieval of components using a neural network and parameters based on user choice through clustering. This paper proposes an algorithm that provides an error-free and automatic process for retrieving components during reuse. In this algorithm, keywords (or components) are extracted from software documents and then clustered by applying the k-means clustering algorithm. Weights are then assigned to those keywords based on their frequency, and an ANN predicts whether the correct weight has been assigned to each keyword (or component); if not, it back-propagates to the initial step to re-assign the weights. Finally, all the keywords are stored in repositories for effective retrieval. The proposed algorithm is very effective in error detection and correction and takes the user's choice into account when selecting components for reuse, enabling efficient retrieval.

Keywords: component based development, clustering, back propagation algorithm, keyword based retrieval

Procedia PDF Downloads 359
19582 A Clustering-Sequencing Approach to the Facility Layout Problem

Authors: Saeideh Salimpour, Sophie-Charlotte Viaux, Ahmed Azab, Mohammed Fazle Baki

Abstract:

The Facility Layout Problem (FLP) is key to the efficient and cost-effective operation of a system. This paper presents a hybrid heuristic- and mathematical-programming-based approach that divides the problem conceptually into those of clustering and sequencing. First, clusters of vertically aligned facilities are formed, which are later on sequenced horizontally. The developed methodology provides promising results in comparison to its counterparts in the literature by minimizing the inter-distances for facilities which have more interactions amongst each other and aims at placing the facilities with more interactions at the centroid of the shop.

Keywords: clustering-sequencing approach, mathematical modeling, optimization, unequal facility layout problem

Procedia PDF Downloads 309
19581 Geographic Legacies for Modern Day Disease Research: Autism Spectrum Disorder as a Case-Control Study

Authors: Rebecca Richards Steed, James Van Derslice, Ken Smith, Richard Medina, Amanda Bakian

Abstract:

Elucidating gene-environment interactions for heritable disease outcomes is an emerging area of disease research, with genetic studies informing hypotheses for environment and gene interactions underlying some of the most confounding diseases of our time, like autism spectrum disorder (ASD). Geography has thus far played a key role in identifying environmental factors contributing to disease, but its use can be broadened to include genetic and environmental factors that have a synergistic effect on disease. Through the use of family pedigrees and disease outcomes with life-course residential histories, space-time clustering of generations at critical developmental windows can provide further understanding of (1) environmental factors that contribute to disease patterns in families, (2) susceptible critical windows of development most impacted by environment, and (3) which of these are most likely to lead to an ASD diagnosis. This paper introduces a retrospective case-control study that utilizes pedigree data, health data, and residential life-course location points to find space-time clustering of ancestors with a grandchild/child with a clinical diagnosis of ASD. Finding space-time clusters of ancestors at critical developmental windows serves as a proxy for shared environmental exposures. The authors refer to geographic life-course exposures as geographic legacies. Identifying space-time clusters of ancestors creates a bridge for researching exposures of past generations that may impact modern-day progeny health. Results from the space-time cluster analysis show multiple clusters for the maternal and paternal pedigrees. The paternal grandparent pedigree resulted in the most space-time clustering for birth and childhood developmental windows. No statistically significant clustering was found for adolescent years. These results will be further studied to identify the specific share of space-time environmental exposures.
In conclusion, this study has found significant space-time clusters of parents and grandparents for both maternal and paternal lineages. These results will be used to identify which environmental exposures have been shared among family members at critical developmental windows of time, and additional analysis will be applied.

Keywords: family pedigree, environmental exposure, geographic legacy, medical geography, transgenerational inheritance

Procedia PDF Downloads 98
19580 Unsupervised Segmentation Technique for Acute Leukemia Cells Using Clustering Algorithms

Authors: N. H. Harun, A. S. Abdul Nasir, M. Y. Mashor, R. Hassan

Abstract:

Leukaemia is a blood cancer that contributes to the rising mortality rate in Malaysia each year. There are two main categories of leukaemia: acute and chronic. The production and development of acute leukaemia cells occur rapidly and uncontrollably. Therefore, if acute leukaemia cells can be identified quickly and effectively, proper treatment and medicine can be delivered. Because leukaemia requires prompt and accurate diagnosis, the current study proposes unsupervised pixel segmentation based on clustering algorithms to obtain a fully segmented abnormal white blood cell (blast) in acute leukaemia images. To obtain the segmented blast, three clustering algorithms, namely k-means, fuzzy c-means and moving k-means, are applied to the saturation component image. Then, median filter and seeded region growing area extraction algorithms are applied to smooth the region of the segmented blast and to remove large unwanted regions from the image, respectively. The three clustering algorithms are compared to measure the performance of each in segmenting the blast area. Based on the good sensitivity value obtained, the results indicate that the moving k-means clustering algorithm successfully produces a fully segmented blast region in acute leukaemia images. Hence, the resultant images could be helpful to haematologists for further analysis of acute leukaemia.

Keywords: acute leukaemia images, clustering algorithms, image segmentation, moving k-means

Procedia PDF Downloads 263
19579 Radar on Bike: Coarse Classification based on Multi-Level Clustering for Cyclist Safety Enhancement

Authors: Asma Omri, Noureddine Benothman, Sofiane Sayahi, Fethi Tlili, Hichem Besbes

Abstract:

Cycling, a popular mode of transportation, can also be perilous due to cyclists' vulnerability to collisions with vehicles and obstacles. This paper presents an innovative cyclist safety system based on radar technology designed to offer real-time collision risk warnings to cyclists. The system incorporates a low-power radar sensor affixed to the bicycle and connected to a microcontroller. It leverages radar point cloud detections, a clustering algorithm, and a supervised classifier. These algorithms are optimized for efficiency to run on the TI’s AWR 1843 BOOST radar, utilizing a coarse classification approach distinguishing between cars, trucks, two-wheeled vehicles, and other objects. To enhance the performance of clustering techniques, we propose a 2-Level clustering approach. This approach builds on the state-of-the-art Density-based spatial clustering of applications with noise (DBSCAN). The objective is to first cluster objects based on their velocity, then refine the analysis by clustering based on position. The initial level identifies groups of objects with similar velocities and movement patterns. The subsequent level refines the analysis by considering the spatial distribution of these objects. The clusters obtained from the first level serve as input for the second level of clustering. Our proposed technique surpasses the classical DBSCAN algorithm in terms of geometrical metrics, including homogeneity, completeness, and V-score. Relevant cluster features are extracted and utilized to classify objects using an SVM classifier. Potential obstacles are identified based on their velocity and proximity to the cyclist. To optimize the system, we used the View of Delft dataset for hyperparameter selection and SVM classifier training. The system's performance was assessed using our collected dataset of radar point clouds synchronized with a camera on an Nvidia Jetson Nano board. 
The radar-based cyclist safety system is a practical solution that can be easily installed on any bicycle and connected to smartphones or other devices, offering real-time feedback and navigation assistance to cyclists. We conducted experiments to validate the system's feasibility, achieving an impressive 85% accuracy in the classification task. This system has the potential to significantly reduce the number of accidents involving cyclists and enhance their safety on the road.
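The two-level idea described in this abstract (cluster by velocity first, then refine each velocity group by position) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the small `dbscan` helper, the `(x, y, radial_velocity)` detection layout, the parameter names `eps_v` and `eps_xy`, and the sample detections are all assumptions made for the example.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN; returns one label per point (-1 marks noise)."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels

def two_level_clustering(detections, eps_v, eps_xy, min_samples):
    """Level 1 clusters on velocity; level 2 clusters each group on position."""
    velocity_labels = dbscan([(d[2],) for d in detections], eps_v, min_samples)
    final = [NOISE] * len(detections)
    next_id = 0
    for v_label in sorted(set(velocity_labels) - {NOISE}):
        idx = [i for i, lab in enumerate(velocity_labels) if lab == v_label]
        pos_labels = dbscan([detections[i][:2] for i in idx], eps_xy, min_samples)
        for p_label in sorted(set(pos_labels) - {NOISE}):
            for i, lab in zip(idx, pos_labels):
                if lab == p_label:
                    final[i] = next_id
            next_id += 1
    return final

# Hypothetical detections: (x, y, radial_velocity)
detections = [
    (0.0, 10.0, 5.0), (0.5, 10.2, 5.1), (0.2, 9.8, 4.9),      # object A
    (8.0, 20.0, 5.0), (8.3, 20.1, 5.2), (8.1, 19.7, 4.8),     # object B, same speed as A
    (3.0, 15.0, -3.0), (3.2, 15.1, -3.1), (2.9, 14.8, -2.9),  # object C
]
labels = two_level_clustering(detections, eps_v=0.5, eps_xy=1.0, min_samples=2)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

In this toy input, objects A and B share a velocity group at the first level and are only separated at the second level, where their positions differ.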

Keywords: 2-level clustering, coarse classification, cyclist safety, warning system based on radar technology

Procedia PDF Downloads 54
19578 Ensuring Uniform Energy Consumption in Non-Deterministic Wireless Sensor Network to Protract Networks Lifetime

Authors: Vrince Vimal, Madhav J. Nigam

Abstract:

Wireless sensor networks have attracted much attention from researchers all around the world, owing to their extensive applicability in agricultural, industrial and military fields. Energy-conserving node deployment strategies play a notable role in the active implementation of wireless sensor networks. Clustering is an approach in wireless sensor networks that improves the energy efficiency of the network. The clustering algorithm needs to have an optimum size and number of clusters, as clustering, if not implemented properly, cannot effectively increase the life of the network. In this paper, an algorithm is proposed to address connectivity issues with the aim of ensuring uniform energy consumption of nodes in every part of the network. The simulation results show that the proposed algorithm has an edge over existing algorithms in terms of throughput and network lifetime.

Keywords: Wireless Sensor network (WSN), Random Deployment, Clustering, Isolated Nodes, Networks Lifetime

Procedia PDF Downloads 313
19577 Statistical Time-Series and Neural Architecture of Malaria Patients Records in Lagos, Nigeria

Authors: Akinbo Razak Yinka, Adesanya Kehinde Kazeem, Oladokun Oluwagbenga Peter

Abstract:

Time series data are sequences of observations collected over a period of time. Such data can be used to predict health outcomes, such as disease progression, mortality, and hospitalization. The statistical approach is based on mathematical models that capture the patterns and trends of the data, such as autocorrelation, seasonality, and noise, while neural methods are based on artificial neural networks, which are computational models that mimic the structure and function of biological neurons. This paper compared parametric and non-parametric time series models of patients treated for malaria in Maternal and Child Health Centres in Lagos State, Nigeria. The forecast methods considered were linear regression, Integrated Moving Average, ARIMA and SARIMA modelling for the parametric approach, while Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM) networks were used for the non-parametric approach. The performance of each method was evaluated using the Mean Absolute Error (MAE), R-squared (R2) and Root Mean Square Error (RMSE) as criteria to determine the accuracy of each model. The study revealed that the best performance in terms of error was found in MLP, followed by the LSTM and ARIMA models. In addition, the Bootstrap Aggregating technique was used to make robust forecasts when there are uncertainties in the data.
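The three evaluation criteria named in this abstract (MAE, RMSE and R2) can be computed as in the short sketch below. The function names and the sample values are illustrative assumptions; the numbers are not taken from the paper.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: penalises large errors more than MAE does."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)

def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by the forecast."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical monthly patient counts and model forecasts (not from the paper)
actual = [120, 135, 150, 160, 155]
forecast = [118, 140, 145, 165, 150]

print(mae(actual, forecast), rmse(actual, forecast), r_squared(actual, forecast))
```

Lower MAE and RMSE and a higher R2 indicate a better fit, which is how models such as MLP, LSTM and ARIMA can be ranked against one another on the same hold-out data.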

Keywords: ARIMA, bootstrap aggregation, MLP, LSTM, SARIMA, time-series analysis

Procedia PDF Downloads 42
19576 Income-Consumption Relationships in Pakistan (1980-2011): A Cointegration Approach

Authors: Himayatullah Khan, Alena Fedorova

Abstract:

The present paper analyses the income-consumption relationships in Pakistan using annual time series data from 1980-81 to 2010-11. The paper uses the Augmented Dickey-Fuller test to check for unit roots and stationarity in these two time series. The paper finds that the two time series are nonstationary in levels but stationary in first differences. The Augmented Engle-Granger test and the Cointegrating Regression Durbin-Watson test imply that the two time series of consumption and income are cointegrated and that the long-run marginal propensity to consume (MPC) is 0.88, which is given by the estimated (static) equilibrium relation. The paper also uses the error correction mechanism (ECM) to model the dynamic relationship. The purpose of the ECM is to indicate the speed of adjustment from the short-run equilibrium to the long-run equilibrium state. The results show that the MPC is equal to 0.93 and is highly significant. The coefficient of the Engle-Granger residuals is negative but insignificant. Statistically, the equilibrium error term is zero, which suggests that consumption adjusts to changes in GDP in the same period. The short-run changes in GDP have a positive impact on short-run changes in consumption. The paper concludes that we may interpret 0.93 as the short-run MPC. The pair-wise Granger causality test shows that GDP and consumption Granger-cause each other.
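For orientation, the two regressions behind this kind of analysis are usually written as below. This is the textbook Engle-Granger form, not equations quoted from the paper; the symbols C_t (consumption), Y_t (income or GDP), and the coefficient names are assumptions chosen for the illustration.

```latex
% Step 1: static (long-run) cointegrating regression.
% \beta_1 is the long-run MPC (the abstract reports 0.88).
C_t = \beta_0 + \beta_1 Y_t + u_t

% Step 2: error correction model on first differences,
% using the lagged residual \hat{u}_{t-1} from step 1.
% \alpha_1 is the short-run MPC (the abstract reports 0.93);
% \alpha_2 is the error-correction coefficient, expected to be negative,
% and its magnitude measures the speed of adjustment.
\Delta C_t = \alpha_0 + \alpha_1 \Delta Y_t + \alpha_2 \hat{u}_{t-1} + \varepsilon_t
```

Under this reading, an insignificant \alpha_2 is what the abstract interprets as consumption adjusting to changes in GDP within the same period.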

Keywords: cointegrating regression, Augmented Dickey Fuller test, Augmented Engle-Granger test, Granger causality, error correction mechanism

Procedia PDF Downloads 390
19575 Wind Speed Data Analysis in Colombia in 2013 and 2015

Authors: Harold P. Villota, Alejandro Osorio B.

Abstract:

Energy meteorology is a field that studies energy complementarity and the use of renewable sources in interconnected systems. To diversify Colombia's energy matrix with wind sources, it is necessary to understand the available wind databases. However, the time series provided by 260 automatic weather stations contain empty and not-applicable values, so the aim is to fill the time series by selecting two years to characterize and impute, and then using them as a basis to complete the data between 2005 and 2020.

Keywords: complementarity, wind speed, renewable, Colombia, characterization, imputation

Procedia PDF Downloads 139
19574 An Efficient Clustering Technique for Copy-Paste Attack Detection

Authors: N. Chaitawittanun, M. Munlin

Abstract:

Due to the rapid advancement of powerful image processing software, digital images are easy for ordinary people to manipulate and modify. Many digital images are edited for a specific purpose and are difficult to distinguish from their originals. We propose a clustering method to detect copy-move forgery in JPEG, BMP, TIFF, and PNG images. The process starts by reducing the colors of the photos. Then, we use the clustering technique to partition the measured data by Hausdorff distance. The results show that the proposed method is capable of inspecting the image file and correctly identifying the forgery.
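The Hausdorff distance mentioned in this abstract measures how far two point sets are from each other. The sketch below shows the standard definition only; how the paper builds its point sets from image regions is not stated, so the two sample sets and the function names are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical point sets, e.g. pixel coordinates taken from two image regions
region_a = [(0, 0), (1, 0), (0, 1)]
region_b = [(0, 0), (1, 0), (0, 4)]

print(hausdorff(region_a, region_a))  # 0.0
print(hausdorff(region_a, region_b))  # 3.0
```

A distance of zero means the two sets coincide, so very small distances between regions of the same image are one plausible signal of duplicated content.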

Keywords: image detection, forgery image, copy-paste, attack detection

Procedia PDF Downloads 309
19573 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay and bandwidth are the prime issues in wireless sensor networks (WSNs). Energy usage optimization and efficient bandwidth utilization are important issues in WSNs. Event-triggered data aggregation facilitates such optimization for the event-affected area of a WSN. Reliable delivery of critical information to the sink node is also a major challenge in WSNs. To tackle these issues, we propose an event-driven dynamic clustering and data aggregation scheme for WSNs that enhances the lifetime of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever an event is triggered, the event-triggered node selects the cluster head. (2) The cluster head gathers data from the sensor nodes within the cluster. (3) The cluster head identifies and classifies the events in the collected data using a Bayesian classifier. (4) The data is aggregated using a statistical method. (5) The cluster head discovers the paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data is critical, the cluster head sends it over multiple paths for reliable data communication. (7) Otherwise, the aggregated data is transmitted towards the sink node over the single path that has the most bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 417
19572 Application of Fuzzy Clustering on Classification Agile Supply Chain Firms

Authors: Hamidreza Fallah Lajimi, Elham Karami, Alireza Arab, Fatemeh Alinasab

Abstract:

Being responsive is an increasingly important skill for firms in today’s global economy; thus, firms must be agile. Naturally, it follows that an organization’s agility depends on its supply chain being agile. However, achieving supply chain agility is a function of other abilities within the organization. This paper analyses results from a survey of 71 Iranian manufacturing companies in order to identify some of the factors that agile organizations rely on in managing their supply chains. We then classify these companies into four clusters with the fuzzy c-means technique and use four validation functions to determine the optimal number of clusters automatically.
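Fuzzy c-means, the technique named in this abstract, assigns each item a degree of membership in every cluster instead of a single hard label. The sketch below is a minimal standard-library version under stated assumptions: the deterministic initialisation, the toy two-feature data, and the use of the partition coefficient as a validity index are illustrative choices, since the abstract does not name its four validation functions.

```python
import math

def memberships_for(points, centres, m):
    """Membership of each point in each cluster (every row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Guard against a zero distance when a point sits on a centre.
        dists = [max(math.dist(p, c), 1e-12) for c in centres]
        rows.append([1.0 / sum((d_k / d_j) ** power for d_j in dists)
                     for d_k in dists])
    return rows

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6):
    """Alternate membership and centre updates until the centres stop moving."""
    # Assumed deterministic start: pick evenly spaced points as initial centres.
    step = max(1, len(points) // n_clusters)
    centres = [list(points[k * step]) for k in range(n_clusters)]
    dim = len(points[0])
    for _ in range(n_iter):
        u = memberships_for(points, centres, m)
        new_centres = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            new_centres.append([sum(w * p[d] for w, p in zip(weights, points)) / total
                                for d in range(dim)])
        shift = max(math.dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, memberships_for(points, centres, m)

def partition_coefficient(u):
    """Validity index in (0, 1]; values near 1 indicate a crisp partition."""
    return sum(x * x for row in u for x in row) / len(u)

# Hypothetical survey scores for six firms on two agility factors
firms = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centres, u = fuzzy_c_means(firms, n_clusters=2)
hard_labels = [row.index(max(row)) for row in u]
print(hard_labels, round(partition_coefficient(u), 3))
```

To choose the number of clusters automatically, one common approach is to run the algorithm for several candidate values and compare validity indices such as this one across the runs.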

Keywords: agile supply chain, clustering, fuzzy clustering, business engineering

Procedia PDF Downloads 673