Search results for: spectral clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1337

Search results for: spectral clustering

1217 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Keywords: bipartite graph, one-mode projection, clustering, web proxy detection

Procedia PDF Downloads 245
1216 Issue Reorganization Using the Measure of Relevance

Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim

Abstract:

Recently, the demand of extracting the R&D keywords from the issues and using them in retrieving R&D information is increasing rapidly. But it is hard to identify the related issues or to distinguish them. Although the similarity between the issues cannot be identified, but with the R&D lexicon, the issues that always shared the same R&D keywords can be determined. In details, the R&D keywords that associated with particular issue is implied the key technology elements that needed to solve the problem of the particular issue. Furthermore, the related issues that sharing the same R&D keywords can be showed in a more systematic way through the issue clustering constructed from the perspective of R&D. Thus, sharing of the R&D result and reusable of the R&D technology can be facilitated. Indirectly, the redundancy of investment on the same R&D can be reduce as the R&D information can be shared between those corresponding issues and reusability of the related R&D can be improved. Therefore, a methodology of constructing an issue clustering from the perspective of common R&D keywords is proposed to satisfy the demands mentioned.

Keywords: clustering, social network analysis, text mining, topic analysis

Procedia PDF Downloads 573
1215 A Hybrid Method for Determination of Effective Poles Using Clustering Dominant Pole Algorithm

Authors: Anuj Abraham, N. Pappa, Daniel Honc, Rahul Sharma

Abstract:

In this paper, an analysis of some model order reduction techniques is presented. A new hybrid algorithm for model order reduction of linear time invariant systems is compared with the conventional techniques namely Balanced Truncation, Hankel Norm reduction and Dominant Pole Algorithm (DPA). The proposed hybrid algorithm is known as Clustering Dominant Pole Algorithm (CDPA) is able to compute the full set of dominant poles and its cluster center efficiently. The dominant poles of a transfer function are specific eigenvalues of the state space matrix of the corresponding dynamical system. The effectiveness of this novel technique is shown through the simulation results.

Keywords: balanced truncation, clustering, dominant pole, Hankel norm, model reduction

Procedia PDF Downloads 599
1214 A Neural Network Based Clustering Approach for Imputing Multivariate Values in Big Data

Authors: S. Nickolas, Shobha K.

Abstract:

The treatment of incomplete data is an important step in the data pre-processing. Missing values creates a noisy environment in all applications and it is an unavoidable problem in big data management and analysis. Numerous techniques likes discarding rows with missing values, mean imputation, expectation maximization, neural networks with evolutionary algorithms or optimized techniques and hot deck imputation have been introduced by researchers for handling missing data. Among these, imputation techniques plays a positive role in filling missing values when it is necessary to use all records in the data and not to discard records with missing values. In this paper we propose a novel artificial neural network based clustering algorithm, Adaptive Resonance Theory-2(ART2) for imputation of missing values in mixed attribute data sets. The process of ART2 can recognize learned models fast and be adapted to new objects rapidly. It carries out model-based clustering by using competitive learning and self-steady mechanism in dynamic environment without supervision. The proposed approach not only imputes the missing values but also provides information about handling the outliers.

Keywords: ART2, data imputation, clustering, missing data, neural network, pre-processing

Procedia PDF Downloads 274
1213 Structure Clustering for Milestoning Applications of Complex Conformational Transitions

Authors: Amani Tahat, Serdal Kirmizialtin

Abstract:

Trajectory fragment methods such as Markov State Models (MSM), Milestoning (MS) and Transition Path sampling are the prime choice of extending the timescale of all atom Molecular Dynamics simulations. In these approaches, a set of structures that covers the accessible phase space has to be chosen a priori using cluster analysis. Structural clustering serves to partition the conformational state into natural subgroups based on their similarity, an essential statistical methodology that is used for analyzing numerous sets of empirical data produced by Molecular Dynamics (MD) simulations. Local transition kernel among these clusters later used to connect the metastable states using a Markovian kinetic model in MSM and a non-Markovian model in MS. The choice of clustering approach in constructing such kernel is crucial since the high dimensionality of the biomolecular structures might easily confuse the identification of clusters when using the traditional hierarchical clustering methodology. Of particular interest, in the case of MS where the milestones are very close to each other, accurate determination of the milestone identity of the trajectory becomes a challenging issue. Throughout this work we present two cluster analysis methods applied to the cis–trans isomerism of dinucleotide AA. The choice of nucleic acids to commonly used proteins to study the cluster analysis is two fold: i) the energy landscape is rugged; hence transitions are more complex, enabling a more realistic model to study conformational transitions, ii) Nucleic acids conformational space is high dimensional. A diverse set of internal coordinates is necessary to describe the metastable states in nucleic acids, posing a challenge in studying the conformational transitions. Herein, we need improved clustering methods that accurately identify the AA structure in its metastable states in a robust way for a wide range of confused data conditions. The single linkage approach of the hierarchical clustering available in GROMACS MD-package is the first clustering methodology applied to our data. Self Organizing Map (SOM) neural network, that also known as a Kohonen network, is the second data clustering methodology. The performance comparison of the neural network as well as hierarchical clustering method is studied by means of computing the mean first passage times for the cis-trans conformational rates. Our hope is that this study provides insight into the complexities and need in determining the appropriate clustering algorithm for kinetic analysis. Our results can improve the effectiveness of decisions based on clustering confused empirical data in studying conformational transitions in biomolecules.

Keywords: milestoning, self organizing map, single linkage, structure clustering

Procedia PDF Downloads 224
1212 Algorithm and Software Based on Multilayer Perceptron Neural Networks for Estimating Channel Use in the Spectral Decision Stage in Cognitive Radio Networks

Authors: Danilo López, Johana Hernández, Edwin Rivas

Abstract:

The use of the Multilayer Perceptron Neural Networks (MLPNN) technique is presented to estimate the future state of use of a licensed channel by primary users (PUs); this will be useful at the spectral decision stage in cognitive radio networks (CRN) to determine approximately in which time instants of future may secondary users (SUs) opportunistically use the spectral bandwidth to send data through the primary wireless network. To validate the results, sequences of occupancy data of channel were generated by simulation. The results show that the prediction percentage is greater than 60% in some of the tests carried out.

Keywords: cognitive radio, neural network, prediction, primary user

Procedia PDF Downloads 371
1211 Analyzing the Results of Buildings Energy Audit by Using Grey Set Theory

Authors: Tooraj Karimi, Mohammadreza Sadeghi Moghadam

Abstract:

Grey set theory has the advantage of using fewer data to analyze many factors, and it is therefore more appropriate for system study rather than traditional statistical regression which require massive data, normal distribution in the data and few variant factors. So, in this paper grey clustering and entropy of coefficient vector of grey evaluations are used to analyze energy consumption in buildings of the Oil Ministry in Tehran. In fact, this article intends to analyze the results of energy audit reports and defines most favorable characteristics of system, which is energy consumption of buildings, and most favorable factors affecting these characteristics in order to modify and improve them. According to the results of the model, ‘the real Building Load Coefficient’ has been selected as the most important system characteristic and ‘uncontrolled area of the building’ has been diagnosed as the most favorable factor which has the greatest effect on energy consumption of building. Grey clustering in this study has been used for two purposes: First, all the variables of building relate to energy audit cluster in two main groups of indicators and the number of variables is reduced. Second, grey clustering with variable weights has been used to classify all buildings in three categories named ‘no standard deviation’, ‘low standard deviation’ and ‘non- standard’. Entropy of coefficient vector of Grey evaluations is calculated to investigate greyness of results. It shows that among the 38 buildings surveyed in terms of energy consumption, 3 cases are in standard group, 24 cases are in ‘low standard deviation’ group and 11 buildings are completely non-standard. In addition, clustering greyness of 13 buildings is less than 0.5 and average uncertainly of clustering results is 66%.

Keywords: energy audit, grey set theory, grey incidence matrixes, grey clustering, Iran oil ministry

Procedia PDF Downloads 373
1210 Design of a Graphical User Interface for Data Preprocessing and Image Segmentation Process in 2D MRI Images

Authors: Enver Kucukkulahli, Pakize Erdogmus, Kemal Polat

Abstract:

The 2D image segmentation is a significant process in finding a suitable region in medical images such as MRI, PET, CT etc. In this study, we have focused on 2D MRI images for image segmentation process. We have designed a GUI (graphical user interface) written in MATLABTM for 2D MRI images. In this program, there are two different interfaces including data pre-processing and image clustering or segmentation. In the data pre-processing section, there are median filter, average filter, unsharp mask filter, Wiener filter, and custom filter (a filter that is designed by user in MATLAB). As for the image clustering, there are seven different image segmentations for 2D MR images. These image segmentation algorithms are as follows: PSO (particle swarm optimization), GA (genetic algorithm), Lloyds algorithm, k-means, the combination of Lloyds and k-means, mean shift clustering, and finally BBO (Biogeography Based Optimization). To find the suitable cluster number in 2D MRI, we have designed the histogram based cluster estimation method and then applied to these numbers to image segmentation algorithms to cluster an image automatically. Also, we have selected the best hybrid method for each 2D MR images thanks to this GUI software.

Keywords: image segmentation, clustering, GUI, 2D MRI

Procedia PDF Downloads 377
1209 Analysis of Spectral Radiative Entropy Generation in a Non-Gray Participating Medium with Heat Source (Furnaces)

Authors: Asadollah Bahrami

Abstract:

In the present study, spectral radiative entropy generation is analyzed in a furnace filled with a mixture of H₂O, CO₂ and soot at radiative equilibrium. For the angular and spatial discretization of the radiative transfer equation and radiative entropy generation equations, the discrete ordinates method and the finite volume method are used, respectively. Spectral radiative properties are obtained using the correlated-k (CK) non-gray model with updated parameters based on the HITEMP2010 high-resolution database. In order to evaluate the effects of the location of the heat source, boundary condition and wall emissivity on radiative entropy generation, five cases are considered with different conditions. The spectral and total radiative entropy generation in the system are calculated for all cases and the effects of mentioned parameters on radiative entropy generation are attentively analyzed and finally, the optimum condition is especially presented. The most important results can be stated as follows: Results demonstrate that the wall emissivity has a considerable effect on the radiative entropy generation. Also, irreversible radiative transfer at the wall with lower temperatures is the main source of radiative entropy generation in the furnaces. In addition, the effect of the location of the heat source on total radiative entropy generation is less than other factors. Eventually, it can be said that characterizing the effective parameters of radiative entropy generation provides an approach to minimizing the radiative entropy generation and enhancing the furnace's performance practicality.

Keywords: spectral radiative entropy generation, non-gray medium, correlated k(CK) model, heat source

Procedia PDF Downloads 103
1208 Extraction of Urban Land Features from TM Landsat Image Using the Land Features Index and Tasseled Cap Transformation

Authors: R. Bouhennache, T. Bouden, A. A. Taleb, A. Chaddad

Abstract:

In this paper we propose a method to map the urban areas. The method uses an arithmetic calculation processed from the land features indexes and Tasseled cap transformation TC of multi spectral Thematic Mapper Landsat TM image. For this purpose the derived indexes image from the original image such SAVI the soil adjusted vegetation index, UI the urban Index, and EBBI the enhanced built up and bareness index were staked to form a new image and the bands were uncorrelated, also the Spectral Angle Mapper (SAM) and Spectral Information Divergence (SID) supervised classification approaches were first applied on the new image TM data using the reference spectra of the spectral library and subsequently the four urban, vegetation, water and soil land cover categories were extracted with their accuracy assessment.The urban features were represented using a logic calculation applied to the brightness, UI-SAVI, NDBI-greenness and EBBI- brightness data sets. The study applied to Blida and mentioned that the urban features can be mapped with an accuracy ranging from 92 % to 95%.

Keywords: EBBI, SAVI, Tasseled Cap Transformation, UI

Procedia PDF Downloads 482
1207 Linking Soil Spectral Behavior and Moisture Content for Soil Moisture Content Retrieval at Field Scale

Authors: Yonwaba Atyosi, Moses Cho, Abel Ramoelo, Nobuhle Majozi, Cecilia Masemola, Yoliswa Mkhize

Abstract:

Spectroscopy has been widely used to understand the hyperspectral remote sensing of soils. Accurate and efficient measurement of soil moisture is essential for precision agriculture. The aim of this study was to understand the spectral behavior of soil at different soil water content levels and identify the significant spectral bands for soil moisture content retrieval at field-scale. The study consisted of 60 soil samples from a maize farm, divided into four different treatments representing different moisture levels. Spectral signatures were measured for each sample in laboratory under artificial light using an Analytical Spectral Device (ASD) spectrometer, covering a wavelength range from 350 nm to 2500 nm, with a spectral resolution of 1 nm. The results showed that the absorption features at 1450 nm, 1900 nm, and 2200 nm were particularly sensitive to soil moisture content and exhibited strong correlations with the water content levels. Continuum removal was developed in the R programming language to enhance the absorption features of soil moisture and to precisely understand its spectral behavior at different water content levels. Statistical analysis using partial least squares regression (PLSR) models were performed to quantify the correlation between the spectral bands and soil moisture content. This study provides insights into the spectral behavior of soil at different water content levels and identifies the significant spectral bands for soil moisture content retrieval. The findings highlight the potential of spectroscopy for non-destructive and rapid soil moisture measurement, which can be applied to various fields such as precision agriculture, hydrology, and environmental monitoring. However, it is important to note that the spectral behavior of soil can be influenced by various factors such as soil type, texture, and organic matter content, and caution should be taken when applying the results to other soil systems. The results of this study showed a good agreement between measured and predicted values of Soil Moisture Content with high R2 and low root mean square error (RMSE) values. Model validation using independent data was satisfactory for all the studied soil samples. The results has significant implications for developing high-resolution and precise field-scale soil moisture retrieval models. These models can be used to understand the spatial and temporal variation of soil moisture content in agricultural fields, which is essential for managing irrigation and optimizing crop yield.

Keywords: soil moisture content retrieval, precision agriculture, continuum removal, remote sensing, machine learning, spectroscopy

Procedia PDF Downloads 99
1206 Enhancement of Pulsed Eddy Current Response Based on Power Spectral Density after Continuous Wavelet Transform Decomposition

Authors: A. Benyahia, M. Zergoug, M. Amir, M. Fodil

Abstract:

The main objective of this work is to enhance the Pulsed Eddy Current (PEC) response from the aluminum structure using signal processing. Cracks and metal loss in different structures cause changes in PEC response measurements. In this paper, time-frequency analysis is used to represent PEC response, which generates a large quantity of data and reduce the noise due to measurement. Power Spectral Density (PSD) after Wavelet Decomposition (PSD-WD) is proposed for defect detection. The experimental results demonstrate that the cracks in the surface can be extracted satisfactorily by the proposed methods. The validity of the proposed method is discussed.

Keywords: DT, pulsed eddy current, continuous wavelet transform, Mexican hat wavelet mother, defect detection, power spectral density.

Procedia PDF Downloads 236
1205 Using Genetic Algorithms and Rough Set Based Fuzzy K-Modes to Improve Centroid Model Clustering Performance on Categorical Data

Authors: Rishabh Srivastav, Divyam Sharma

Abstract:

We propose an algorithm to cluster categorical data named as ‘Genetic algorithm initialized rough set based fuzzy K-Modes for categorical data’. We propose an amalgamation of the simple K-modes algorithm, the Rough and Fuzzy set based K-modes and the Genetic Algorithm to form a new algorithm,which we hypothesise, will provide better Centroid Model clustering results, than existing standard algorithms. In the proposed algorithm, the initialization and updation of modes is done by the use of genetic algorithms while the membership values are calculated using the rough set and fuzzy logic.

Keywords: categorical data, fuzzy logic, genetic algorithm, K modes clustering, rough sets

Procedia PDF Downloads 246
1204 Hybridized Approach for Distance Estimation Using K-Means Clustering

Authors: Ritu Vashistha, Jitender Kumar

Abstract:

Clustering using the K-means algorithm is a very common way to understand and analyze the obtained output data. When a similar object is grouped, this is called the basis of Clustering. There is K number of objects and C number of cluster in to single cluster in which k is always supposed to be less than C having each cluster to be its own centroid but the major problem is how is identify the cluster is correct based on the data. Formulation of the cluster is not a regular task for every tuple of row record or entity but it is done by an iterative process. Each and every record, tuple, entity is checked and examined and similarity dissimilarity is examined. So this iterative process seems to be very lengthy and unable to give optimal output for the cluster and time taken to find the cluster. To overcome the drawback challenge, we are proposing a formula to find the clusters at the run time, so this approach can give us optimal results. The proposed approach uses the Euclidian distance formula as well melanosis to find the minimum distance between slots as technically we called clusters and the same approach we have also applied to Ant Colony Optimization(ACO) algorithm, which results in the production of two and multi-dimensional matrix.

Keywords: ant colony optimization, data clustering, centroids, data mining, k-means

Procedia PDF Downloads 128
1203 Health Trajectory Clustering Using Deep Belief Networks

Authors: Farshid Hajati, Federico Girosi, Shima Ghassempour

Abstract:

We present a Deep Belief Network (DBN) method for clustering health trajectories. Deep Belief Network (DBN) is a deep architecture that consists of a stack of Restricted Boltzmann Machines (RBM). In a deep architecture, each layer learns more complex features than the past layers. The proposed method depends on DBN in clustering without using back propagation learning algorithm. The proposed DBN has a better a performance compared to the deep neural network due the initialization of the connecting weights. We use Contrastive Divergence (CD) method for training the RBMs which increases the performance of the network. The performance of the proposed method is evaluated extensively on the Health and Retirement Study (HRS) database. The University of Michigan Health and Retirement Study (HRS) is a nationally representative longitudinal study that has surveyed more than 27,000 elderly and near-elderly Americans since its inception in 1992. Participants are interviewed every two years and they collect data on physical and mental health, insurance coverage, financial status, family support systems, labor market status, and retirement planning. The dataset is publicly available and we use the RAND HRS version L, which is easy to use and cleaned up version of the data. The size of sample data set is 268 and the length of the trajectories is equal to 10. The trajectories do not stop when the patient dies and represent 10 different interviews of live patients. Compared to the state-of-the-art benchmarks, the experimental results show the effectiveness and superiority of the proposed method in clustering health trajectories.

Keywords: health trajectory, clustering, deep learning, DBN

Procedia PDF Downloads 369
1202 Spectral Coherence Analysis between Grinding Interaction Forces and the Relative Motion of the Workpiece and the Cutting Tool

Authors: Abdulhamit Donder, Erhan Ilhan Konukseven

Abstract:

Grinding operation is performed in order to obtain desired surfaces precisely in machining process. The needed relative motion between the cutting tool and the workpiece is generally created either by the movement of the cutting tool or by the movement of the workpiece or by the movement of both of them as in our case. For all these cases, the coherence level between the movements and the interaction forces is a key influential parameter for efficient grinding. Therefore, in this work, spectral coherence analysis has been performed to investigate the coherence level between grinding interaction forces and the movement of the workpiece on our robotic-grinding experimental setup in METU Mechatronics Laboratory.

Keywords: coherence analysis, correlation, FFT, grinding, hanning window, machining, Piezo actuator, reverse arrangements test, spectral analysis

Procedia PDF Downloads 405
1201 Clustering of Panels and Shade Diffusion Techniques for Partially Shaded PV Array-Review

Authors: Shahida Khatoon, Mohd. Faisal Jalil, Vaishali Gautam

Abstract:

The Photovoltaic (PV) generated power is mainly dependent on environmental factors. The PV array’s lifetime and overall systems effectiveness reduce due to the partial shading condition. Clustering the electrical connections between solar modules is a viable strategy for minimizing these power losses by shade diffusion. This article comprehensively evaluates various PV array clustering/reconfiguration models for PV systems. These are static and dynamic reconfiguration techniques for extracting maximum power in mismatch conditions. This paper explores and analyzes current breakthroughs in solar PV performance improvement strategies that merit further investigation. Altogether, researchers and academicians working in the field of dedicated solar power generation will benefit from this research.

Keywords: static reconfiguration, dynamic reconfiguration, photo voltaic array, partial shading, CTC configuration

Procedia PDF Downloads 116
1200 Probabilistic Graphical Model for the Web

Authors: M. Nekri, A. Khelladi

Abstract:

The world wide web network is a network with a complex topology, the main properties of which are the distribution of degrees in power law, A low clustering coefficient and a weak average distance. Modeling the web as a graph allows locating the information in little time and consequently offering a help in the construction of the research engine. Here, we present a model based on the already existing probabilistic graphs with all the aforesaid characteristics. This work will consist in studying the web in order to know its structuring thus it will enable us to modelize it more easily and propose a possible algorithm for its exploration.

Keywords: clustering coefficient, preferential attachment, small world, web community

Procedia PDF Downloads 272
1199 Clustering-Based Computational Workload Minimization in Ontology Matching

Authors: Mansir Abubakar, Hazlina Hamdan, Norwati Mustapha, Teh Noranis Mohd Aris

Abstract:

In order to build a matching pattern for each class correspondences of ontology, it is required to specify a set of attribute correspondences across two corresponding classes by clustering. Clustering reduces the size of potential attribute correspondences considered in the matching activity, which will significantly reduce the computation workload; otherwise, all attributes of a class should be compared with all attributes of the corresponding class. Most existing ontology matching approaches lack scalable attributes discovery methods, such as cluster-based attribute searching. This problem makes ontology matching activity computationally expensive. It is therefore vital in ontology matching to design a scalable element or attribute correspondence discovery method that would reduce the size of potential elements correspondences during mapping thereby reduce the computational workload in a matching process as a whole. The objective of this work is 1) to design a clustering method for discovering similar attributes correspondences and relationships between ontologies, 2) to discover element correspondences by classifying elements of each class based on element’s value features using K-medoids clustering technique. Discovering attribute correspondence is highly required for comparing instances when matching two ontologies. During the matching process, any two instances across two different data sets should be compared to their attribute values, so that they can be regarded to be the same or not. Intuitively, any two instances that come from classes across which there is a class correspondence are likely to be identical to each other. Besides, any two instances that hold more similar attribute values are more likely to be matched than the ones with less similar attribute values. Most of the time, similar attribute values exist in the two instances across which there is an attribute correspondence. This work will present how to classify attributes of each class with K-medoids clustering, then, clustered groups to be mapped by their statistical value features. We will also show how to map attributes of a clustered group to attributes of the mapped clustered group, generating a set of potential attribute correspondences that would be applied to generate a matching pattern. The K-medoids clustering phase would largely reduce the number of attribute pairs that are not corresponding for comparing instances as only the coverage probability of attributes pairs that reaches 100% and attributes above the specified threshold can be considered as potential attributes for a matching. Using clustering will reduce the size of potential elements correspondences to be considered during mapping activity, which will in turn reduce the computational workload significantly. Otherwise, all element of the class in source ontology have to be compared with all elements of the corresponding classes in target ontology. K-medoids can ably cluster attributes of each class, so that a proportion of attribute pairs that are not corresponding would not be considered when constructing the matching pattern.

Keywords: attribute correspondence, clustering, computational workload, k-medoids clustering, ontology matching

Procedia PDF Downloads 248
1198 Aliasing Free and Additive Error in Spectra for Alpha Stable Signals

Authors: R. Sabre

Abstract:

This work focuses on the symmetric alpha stable process with continuous time frequently used in modeling the signal with indefinitely growing variance, often observed with an unknown additive error. The objective of this paper is to estimate this error from discrete observations of the signal. For that, we propose a method based on the smoothing of the observations via Jackson polynomial kernel and taking into account the width of the interval where the spectral density is non-zero. This technique allows avoiding the “Aliasing phenomenon” encountered when the estimation is made from the discrete observations of a process with continuous time. We have studied the convergence rate of the estimator and have shown that the convergence rate improves in the case where the spectral density is zero at the origin. Thus, we set up an estimator of the additive error that can be subtracted for approaching the original signal without error.

Keywords: spectral density, stable processes, aliasing, non parametric

Procedia PDF Downloads 129
1197 Multi-Spectral Deep Learning Models for Forest Fire Detection

Authors: Smitha Haridasan, Zelalem Demissie, Atri Dutta, Ajita Rattani

Abstract:

Aided by the wind, all it takes is one ember and a few minutes to create a wildfire. Wildfires are growing in frequency and size due to climate change. Wildfires and its consequences are one of the major environmental concerns. Every year, millions of hectares of forests are destroyed over the world, causing mass destruction and human casualties. Thus early detection of wildfire becomes a critical component to mitigate this threat. Many computer vision-based techniques have been proposed for the early detection of forest fire using video surveillance. Several computer vision-based methods have been proposed to predict and detect forest fires at various spectrums, namely, RGB, HSV, and YCbCr. The aim of this paper is to propose a multi-spectral deep learning model that combines information from different spectrums at intermediate layers for accurate fire detection. A heterogeneous dataset assembled from publicly available datasets is used for model training and evaluation in this study. The experimental results show that multi-spectral deep learning models could obtain an improvement of about 4.68 % over those based on a single spectrum for fire detection.

Keywords: deep learning, forest fire detection, multi-spectral learning, natural hazard detection

Procedia PDF Downloads 241
1196 Analysis of Expression Data Using Unsupervised Techniques

Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe

Abstract:

his study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to identify better subtypes of tumors. Identifying subtypes of cancer help in improving the efficacy and reducing the toxicity of the treatments by identifying clues to find target therapeutics. Process of gene expression data analysis described under three steps as preprocessing, clustering, and cluster validation. Feature selection is important since the genomic data are high dimensional with a large number of features compared to samples. Hierarchical clustering and K Means are often used in the analysis of gene expression data. There are several cluster validation techniques used in validating the clusters. Heatmaps are an effective external validation method that allows comparing the identified classes with clinical variables and visual analysis of the classes.

Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation

Procedia PDF Downloads 149
1195 Clustering of Association Rules of ISIS & Al-Qaeda Based on Similarity Measures

Authors: Tamanna Goyal, Divya Bansal, Sanjeev Sofat

Abstract:

In world-threatening terrorist attacks, where early detection, distinction, and prediction are effective diagnosis techniques and for functionally accurate and precise analysis of terrorism data, there are so many data mining & statistical approaches to assure accuracy. The computational extraction of derived patterns is a non-trivial task which comprises specific domain discovery by means of sophisticated algorithm design and analysis. This paper proposes an approach for similarity extraction by obtaining the useful attributes from the available datasets of terrorist attacks and then applying feature selection technique based on the statistical impurity measures followed by clustering techniques on the basis of similarity measures. On the basis of degree of participation of attributes in the rules, the associative dependencies between the attacks are analyzed. Consequently, to compute the similarity among the discovered rules, we applied a weighted similarity measure. Finally, the rules are grouped by applying using hierarchical clustering. We have applied it to an open source dataset to determine the usability and efficiency of our technique, and a literature search is also accomplished to support the efficiency and accuracy of our results.

Keywords: association rules, clustering, similarity measure, statistical approaches

Procedia PDF Downloads 320
1194 A Technique for Image Segmentation Using K-Means Clustering Classification

Authors: Sadia Basar, Naila Habib, Awais Adnan

Abstract:

The paper presents the Technique for Image Segmentation Using K-Means Clustering Classification. The presented algorithms were specific, however, missed the neighboring information and required high-speed computerized machines to run the segmentation algorithms. Clustering is the process of partitioning a group of data points into a small number of clusters. The proposed method is content-aware and feature extraction method which is able to run on low-end computerized machines, simple algorithm, required low-quality streaming, efficient and used for security purpose. It has the capability to highlight the boundary and the object. At first, the user enters the data in the representation of the input. Then in the next step, the digital image is converted into groups clusters. Clusters are divided into many regions. The same categories with same features of clusters are assembled within a group and different clusters are placed in other groups. Finally, the clusters are combined with respect to similar features and then represented in the form of segments. The clustered image depicts the clear representation of the digital image in order to highlight the regions and boundaries of the image. At last, the final image is presented in the form of segments. All colors of the image are separated in clusters.

Keywords: clustering, image segmentation, K-means function, local and global minimum, region

Procedia PDF Downloads 376
1193 Automatic Classification for the Degree of Disc Narrowing from X-Ray Images Using CNN

Authors: Kwangmin Joo

Abstract:

Automatic detection of lumbar vertebrae and classification method is proposed for evaluating the degree of disc narrowing. Prior to classification, deep learning based segmentation is applied to detect individual lumbar vertebra. M-net is applied to segment five lumbar vertebrae and fine-tuning segmentation is employed to improve the accuracy of segmentation. Using the features extracted from previous step, clustering technique, k-means clustering, is applied to estimate the degree of disc space narrowing under four grade scoring system. As preliminary study, techniques proposed in this research could help building an automatic scoring system to diagnose the severity of disc narrowing from X-ray images.

Keywords: Disc space narrowing, Degenerative disc disorders, Deep learning based segmentation, Clustering technique

Procedia PDF Downloads 125
1192 A Novel Method for Silence Removal in Sounds Produced by Percussive Instruments

Authors: B. Kishore Kumar, Rakesh Pogula, T. Kishore Kumar

Abstract:

The steepness of an audio signal which is produced by the musical instruments, specifically percussive instruments is the perception of how high tone or low tone which can be considered as a frequency closely related to the fundamental frequency. This paper presents a novel method for silence removal and segmentation of music signals produced by the percussive instruments and the performance of proposed method is studied with the help of MATLAB simulations. This method is based on two simple features, namely the signal energy and the spectral centroid. As long as the feature sequences are extracted, a simple thresholding criterion is applied in order to remove the silence areas in the sound signal. The simulations were carried on various instruments like drum, flute and guitar and results of the proposed method were analyzed.

Keywords: percussive instruments, spectral energy, spectral centroid, silence removal

Procedia PDF Downloads 411
1191 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low costs. Many scientists around the world use the advantage of this gene profiling to characterize complex biological circumstances and diseases. Microarray techniques that are used in genome-wide gene expression and genome mutation analysis help scientists and physicians in understanding of the pathophysiological mechanisms, in diagnoses and prognoses, and choosing treatment plans. DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deals with the gene expression data effectively. The existing clustering algorithms like Support Vector Machine (SVM), K-means algorithm and evolutionary algorithm etc. are analyzed thoroughly to identify the advantages and limitations. The performance evaluation of the existing algorithms is carried out to determine the best approach. In order to improve the classification performance of the best approach in terms of Accuracy, Convergence Behavior and processing time, a hybrid clustering based optimization approach has been proposed.

Keywords: microarray technology, gene expression data, clustering, gene Selection

Procedia PDF Downloads 323
1190 An Enhanced Distributed Weighted Clustering Algorithm for Intra and Inter Cluster Routing in MANET

Authors: K. Gomathi

Abstract:

Mobile Ad hoc Networks (MANET) is defined as collection of routable wireless mobile nodes with no centralized administration and communicate each other using radio signals. Especially MANETs deployed in hostile environments where hackers will try to disturb the secure data transfer and drain the valuable network resources. Since MANET is battery operated network, preserving the network resource is essential one. For resource constrained computation, efficient routing and to increase the network stability, the network is divided into smaller groups called clusters. The clustering architecture consists of Cluster Head(CH), ordinary node and gateway. The CH is responsible for inter and intra cluster routing. CH election is a prominent research area and many more algorithms are developed using many different metrics. The CH with longer life sustains network lifetime, for this purpose Secondary Cluster Head(SCH) also elected and it is more economical. To nominate efficient CH, a Enhanced Distributed Weighted Clustering Algorithm (EDWCA) has been proposed. This approach considers metrics like battery power, degree difference and speed of the node for CH election. The proficiency of proposed one is evaluated and compared with existing algorithm using Network Simulator(NS-2).

Keywords: MANET, EDWCA, clustering, cluster head

Procedia PDF Downloads 398
1189 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. It will be shown in this paper that it presents some modifications to the original MFCC method. In our work the effectiveness of proposed changes to MFCC called Modified Function Cepstral Coefficients (MODFCC) were tested and compared against the original MFCC and RASTA-MFCC features. The prosodic features such as jitter and shimmer are added to baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases.

Keywords: auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter

Procedia PDF Downloads 425
1188 Data Mining Spatial: Unsupervised Classification of Geographic Data

Authors: Chahrazed Zouaoui

Abstract:

In recent years, the volume of geospatial information is increasing due to the evolution of communication technologies and information, this information is presented often by geographic information systems (GIS) and stored on of spatial databases (BDS). The classical data mining revealed a weakness in knowledge extraction at these enormous amounts of data due to the particularity of these spatial entities, which are characterized by the interdependence between them (1st law of geography). This gave rise to spatial data mining. Spatial data mining is a process of analyzing geographic data, which allows the extraction of knowledge and spatial relationships from geospatial data, including methods of this process we distinguish the monothematic and thematic, geo- Clustering is one of the main tasks of spatial data mining, which is registered in the part of the monothematic method. It includes geo-spatial entities similar in the same class and it affects more dissimilar to the different classes. In other words, maximize intra-class similarity and minimize inter similarity classes. Taking account of the particularity of geo-spatial data. Two approaches to geo-clustering exist, the dynamic processing of data involves applying algorithms designed for the direct treatment of spatial data, and the approach based on the spatial data pre-processing, which consists of applying clustering algorithms classic pre-processed data (by integration of spatial relationships). This approach (based on pre-treatment) is quite complex in different cases, so the search for approximate solutions involves the use of approximation algorithms, including the algorithms we are interested in dedicated approaches (clustering methods for partitioning and methods for density) and approaching bees (biomimetic approach), our study is proposed to design very significant to this problem, using different algorithms for automatically detecting geo-spatial neighborhood in order to implement the method of geo- clustering by pre-treatment, and the application of the bees algorithm to this problem for the first time in the field of geo-spatial.

Keywords: mining, GIS, geo-clustering, neighborhood

Procedia PDF Downloads 375