Search results for: Sparse datasets.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 326

Search results for: Sparse datasets.

326 A Patricia-Tree Approach for Frequent Closed Itemsets

Authors: Moez Ben Hadj Hamida, Yahya SlimaniI

Abstract:

In this paper, we propose an adaptation of the Patricia-Tree for sparse datasets to generate non redundant rule associations. Using this adaptation, we can generate frequent closed itemsets that are more compact than frequent itemsets used in Apriori approach. This adaptation has been experimented on a set of datasets benchmarks.

Keywords: Datamining, Frequent itemsets, Frequent closeditemsets, Sparse datasets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840
325 Sparse Frequencies Extracting from Partial Phase-Only Measurements

Authors: R. Fan, Q. Wan, H. Chen, Y.L. Liu, Y.P. Liu

Abstract:

This paper considers a robust recovery of sparse frequencies from partial phase-only measurements. With the proposed method, sparse frequencies can be reconstructed, which makes full use of the sparse distribution in the Fourier representation of the complex-valued time signal. Simulation experiments illustrate the proposed method-s advantages over conventional methods in both noiseless and additive white Gaussian noise cases.

Keywords: Sparse signal recovery, phase-only measurements, Compressive sensing, convex relaxation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1424
324 An Improved Method to Compute Sparse Graphs for Traveling Salesman Problem

Authors: Y. Wang

Abstract:

The Traveling salesman problem (TSP) is NP-hard in combinatorial optimization. The research shows the algorithms for TSP on the sparse graphs have the shorter computation time than those for TSP according to the complete graphs. We present an improved iterative algorithm to compute the sparse graphs for TSP by frequency graphs computed with frequency quadrilaterals. The iterative algorithm is enhanced by adjusting two parameters of the algorithm. The computation time of the algorithm is O(CNmaxn2) where C is the iterations, Nmax is the maximum number of frequency quadrilaterals containing each edge and n is the scale of TSP. The experimental results showed the computed sparse graphs generally have less than 5n edges for most of these Euclidean instances. Moreover, the maximum degree and minimum degree of the vertices in the sparse graphs do not have much difference. Thus, the computation time of the methods to resolve the TSP on these sparse graphs will be greatly reduced.

Keywords: Frequency quadrilateral, iterative algorithm, sparse graph, traveling salesman problem.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 970
323 Feature Selection with Kohonen Self Organizing Classification Algorithm

Authors: Francesco Maiorana

Abstract:

In this paper a one-dimension Self Organizing Map algorithm (SOM) to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification for each class a set of positive and negative features is computed. This set of features is selected as result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, the application of the methodology to massive datasets.

Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3007
322 Gene Expression Data Classification Using Discriminatively Regularized Sparse Subspace Learning

Authors: Chunming Xu

Abstract:

Sparse representation which can represent high dimensional data effectively has been successfully used in computer vision and pattern recognition problems. However, it doesn-t consider the label information of data samples. To overcome this limitation, we develop a novel dimensionality reduction algorithm namely dscriminatively regularized sparse subspace learning(DR-SSL) in this paper. The proposed DR-SSL algorithm can not only make use of the sparse representation to model the data, but also can effective employ the label information to guide the procedure of dimensionality reduction. In addition,the presented algorithm can effectively deal with the out-of-sample problem.The experiments on gene-expression data sets show that the proposed algorithm is an effective tool for dimensionality reduction and gene-expression data classification.

Keywords: sparse representation, dimensionality reduction, labelinformation, sparse subspace learning, gene-expression data classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1406
321 A Transform Domain Function Controlled VSSLMS Algorithm for Sparse System Identification

Authors: Cemil Turan, Mohammad Shukri Salman

Abstract:

The convergence rate of the least-mean-square (LMS) algorithm deteriorates if the input signal to the filter is correlated. In a system identification problem, this convergence rate can be improved if the signal is white and/or if the system is sparse. We recently proposed a sparse transform domain LMS-type algorithm that uses a variable step-size for a sparse system identification. The proposed algorithm provided high performance even if the input signal is highly correlated. In this work, we investigate the performance of the proposed TD-LMS algorithm for a large number of filter tap which is also a critical issue for standard LMS algorithm. Additionally, the optimum value of the most important parameter is calculated for all experiments. Moreover, the convergence analysis of the proposed algorithm is provided. The performance of the proposed algorithm has been compared to different algorithms in a sparse system identification setting of different sparsity levels and different number of filter taps. Simulations have shown that the proposed algorithm has prominent performance compared to the other algorithms.

Keywords: Adaptive filtering, sparse system identification, VSSLMS algorithm, TD-LMS algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 946
320 Finding Sparse Features in Face Detection Using Genetic Algorithms

Authors: H. Sagha, S. Kasaei, E. Enayati, M. Dehghani

Abstract:

Although Face detection is not a recent activity in the field of image processing, it is still an open area for research. The greatest step in this field is the work reported by Viola and its recent analogous is Huang et al. Both of them use similar features and also similar training process. The former is just for detecting upright faces, but the latter can detect multi-view faces in still grayscale images using new features called 'sparse feature'. Finding these features is very time consuming and inefficient by proposed methods. Here, we propose a new approach for finding sparse features using a genetic algorithm system. This method requires less computational cost and gets more effective features in learning process for face detection that causes more accuracy.

Keywords: Face Detection, Genetic Algorithms, Sparse Feature.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538
319 Learning an Overcomplete Dictionary using a Cauchy Mixture Model for Sparse Decay

Authors: E. S. Gower, M. O. J. Hawksford

Abstract:

An algorithm for learning an overcomplete dictionary using a Cauchy mixture model for sparse decomposition of an underdetermined mixing system is introduced. The mixture density function is derived from a ratio sample of the observed mixture signals where 1) there are at least two but not necessarily more mixture signals observed, 2) the source signals are statistically independent and 3) the sources are sparse. The basis vectors of the dictionary are learned via the optimization of the location parameters of the Cauchy mixture components, which is shown to be more accurate and robust than the conventional data mining methods usually employed for this task. Using a well known sparse decomposition algorithm, we extract three speech signals from two mixtures based on the estimated dictionary. Further tests with additive Gaussian noise are used to demonstrate the proposed algorithm-s robustness to outliers.

Keywords: expectation-maximization, Pitman estimator, sparsedecomposition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1907
318 Sparse-View CT Reconstruction Based on Nonconvex L1 − L2 Regularizations

Authors: Ali Pour Yazdanpanah, Farideh Foroozandeh Shahraki, Emma Regentova

Abstract:

The reconstruction from sparse-view projections is one of important problems in computed tomography (CT) limited by the availability or feasibility of obtaining of a large number of projections. Traditionally, convex regularizers have been exploited to improve the reconstruction quality in sparse-view CT, and the convex constraint in those problems leads to an easy optimization process. However, convex regularizers often result in a biased approximation and inaccurate reconstruction in CT problems. Here, we present a nonconvex, Lipschitz continuous and non-smooth regularization model. The CT reconstruction is formulated as a nonconvex constrained L1 − L2 minimization problem and solved through a difference of convex algorithm and alternating direction of multiplier method which generates a better result than L0 or L1 regularizers in the CT reconstruction. We compare our method with previously reported high performance methods which use convex regularizers such as TV, wavelet, curvelet, and curvelet+TV (CTV) on the test phantom images. The results show that there are benefits in using the nonconvex regularizer in the sparse-view CT reconstruction.

Keywords: Computed tomography, sparse-view reconstruction, L1 −L2 minimization, non-convex, difference of convex functions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1961
317 An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique

Authors: Ghada A. Alfattni

Abstract:

Analysing unbalanced datasets is one of the challenges that practitioners in machine learning field face. However, many researches have been carried out to determine the effectiveness of the use of the synthetic minority over-sampling technique (SMOTE) to address this issue. The aim of this study was therefore to compare the effectiveness of the SMOTE over different models on unbalanced datasets. Three classification models (Logistic Regression, Support Vector Machine and Nearest Neighbour) were tested with multiple datasets, then the same datasets were oversampled by using SMOTE and applied again to the three models to compare the differences in the performances. Results of experiments show that the highest number of nearest neighbours gives lower values of error rates. 

Keywords: Imbalanced datasets, SMOTE, machine learning, logistic regression, support vector machine, nearest neighbour.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1282
316 A Dictionary Learning Method Based On EMD for Audio Sparse Representation

Authors: Yueming Wang, Zenghui Zhang, Rendong Ying, Peilin Liu

Abstract:

Sparse representation has long been studied and several dictionary learning methods have been proposed. The dictionary learning methods are widely used because they are adaptive. In this paper, a new dictionary learning method for audio is proposed. Signals are at first decomposed into different degrees of Intrinsic Mode Functions (IMF) using Empirical Mode Decomposition (EMD) technique. Then these IMFs form a learned dictionary. To reduce the size of the dictionary, the K-means method is applied to the dictionary to generate a K-EMD dictionary. Compared to K-SVD algorithm, the K-EMD dictionary decomposes audio signals into structured components, thus the sparsity of the representation is increased by 34.4% and the SNR of the recovered audio signals is increased by 20.9%.

Keywords: Dictionary Learning, EMD, K-means Method, Sparse Representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2583
315 Sparsity-Aware and Noise-Robust Subband Adaptive Filter

Authors: Young-Seok Choi

Abstract:

This paper presents a subband adaptive filter (SAF) for a system identification where an impulse response is sparse and disturbed with an impulsive noise. Benefiting from the uses of l1-norm optimization and l0-norm penalty of the weight vector in the cost function, the proposed l0-norm sign SAF (l0-SSAF) achieves both robustness against impulsive noise and much improved convergence behavior than the classical adaptive filters. Simulation results in the system identification scenario confirm that the proposed l0-norm SSAF is not only more robust but also faster and more accurate than its counterparts in the sparse system identification in the presence of impulsive noise.

Keywords: Subband adaptive filter, l0-norm, sparse system, robustness, impulsive interference.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1743
314 Automatic Visualization Pipeline Formation for Medical Datasets on Grid Computing Environment

Authors: Aboamama Atahar Ahmed, Muhammad Shafie Abd Latiff, Kamalrulnizam Abu Bakar, Zainul AhmadRajion

Abstract:

Distance visualization of large datasets often takes the direction of remote viewing and zooming techniques of stored static images. However, the continuous increase in the size of datasets and visualization operation causes insufficient performance with traditional desktop computers. Additionally, the visualization techniques such as Isosurface depend on the available resources of the running machine and the size of datasets. Moreover, the continuous demand for powerful computing powers and continuous increase in the size of datasets results an urgent need for a grid computing infrastructure. However, some issues arise in current grid such as resources availability at the client machines which are not sufficient enough to process large datasets. On top of that, different output devices and different network bandwidth between the visualization pipeline components often result output suitable for one machine and not suitable for another. In this paper we investigate how the grid services could be used to support remote visualization of large datasets and to break the constraint of physical co-location of the resources by applying the grid computing technologies. We show our grid enabled architecture to visualize large medical datasets (circa 5 million polygons) for remote interactive visualization on modest resources clients.

Keywords: Visualization, Grid computing, Medical datasets, visualization techniques, thin clients, Globus toolkit, VTK.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1719
313 The Classification Model for Hard Disk Drive Functional Tests under Sparse Data Conditions

Authors: S. Pattanapairoj, D. Chetchotsak

Abstract:

This paper proposed classification models that would be used as a proxy for hard disk drive (HDD) functional test equitant which required approximately more than two weeks to perform the HDD status classification in either “Pass" or “Fail". These models were constructed by using committee network which consisted of a number of single neural networks. This paper also included the method to solve the problem of sparseness data in failed part, which was called “enforce learning method". Our results reveal that the constructed classification models with the proposed method could perform well in the sparse data conditions and thus the models, which used a few seconds for HDD classification, could be used to substitute the HDD functional tests.

Keywords: Sparse data, Classifications, Committee network

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1696
312 A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error

Authors: Qianhua He, Weili Zhou, Aiwu Chen

Abstract:

A sparse representation speech denoising method based on adapted stopping residue error was presented in this paper. Firstly, the cross-correlation between the clean speech spectrum and the noise spectrum was analyzed, and an estimation method was proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error was adaptively achieved according to the estimated cross-correlation and the adjusted noise spectrum, and the orthogonal matching pursuit (OMP) approach was applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech was re-synthesised via the inverse Fourier transform with the reconstructed speech spectrum and the noisy speech phase. The experiment results show that the proposed method outperforms the conventional methods in terms of subjective and objective measure.

Keywords: Speech denoising, sparse representation, K-singular value decomposition, orthogonal matching pursuit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 972
311 An Algorithm for Computing the Analytic Singular Value Decomposition

Authors: Drahoslava Janovska, Vladimir Janovsky, Kunio Tanabe

Abstract:

A proof of convergence of a new continuation algorithm for computing the Analytic SVD for a large sparse parameter– dependent matrix is given. The algorithm itself was developed and numerically tested in [5].

Keywords: Analytic Singular Value Decomposition, large sparse parameter–dependent matrices, continuation algorithm of a predictorcorrector type.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1416
310 Improving Classification Accuracy with Discretization on Datasets Including Continuous Valued Features

Authors: Mehmet Hacibeyoglu, Ahmet Arslan, Sirzat Kahramanli

Abstract:

This study analyzes the effect of discretization on classification of datasets including continuous valued features. Six datasets from UCI which containing continuous valued features are discretized with entropy-based discretization method. The performance improvement between the dataset with original features and the dataset with discretized features is compared with k-nearest neighbors, Naive Bayes, C4.5 and CN2 data mining classification algorithms. As the result the classification accuracies of the six datasets are improved averagely by 1.71% to 12.31%.

Keywords: Data mining classification algorithms, entropy-baseddiscretization method

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2422
309 Compressed Sensing of Fetal Electrocardiogram Signals Based on Joint Block Multi-Orthogonal Least Squares Algorithm

Authors: Xiang Jianhong, Wang Cong, Wang Linyu

Abstract:

With the rise of medical IoT technologies, Wireless body area networks (WBANs) can collect fetal electrocardiogram (FECG) signals to support telemedicine analysis. The compressed sensing (CS)-based WBANs system can avoid the sampling of a large amount of redundant information and reduce the complexity and computing time of data processing, but the existing algorithms have poor signal compression and reconstruction performance. In this paper, a Joint block multi-orthogonal least squares (JBMOLS) algorithm is proposed. We apply the FECG signal to the Joint block sparse model (JBSM), and a comparative study of sparse transformation and measurement matrices is carried out. A FECG signal compression transmission mode based on Rbio5.5 wavelet, Bernoulli measurement matrix, and JBMOLS algorithm is proposed to improve the compression and reconstruction performance of FECG signal by CS-based WBANs. Experimental results show that the compression ratio (CR) required for accurate reconstruction of this transmission mode is increased by nearly 10%, and the runtime is saved by about 30%.

Keywords: telemedicine, fetal electrocardiogram, compressed sensing, joint sparse reconstruction, block sparse signal

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 411
308 Sparse Unmixing of Hyperspectral Data by Exploiting Joint-Sparsity and Rank-Deficiency

Authors: Fanqiang Kong, Chending Bian

Abstract:

In this work, we exploit two assumed properties of the abundances of the observed signatures (endmembers) in order to reconstruct the abundances from hyperspectral data. Joint-sparsity is the first property of the abundances, which assumes the adjacent pixels can be expressed as different linear combinations of same materials. The second property is rank-deficiency where the number of endmembers participating in hyperspectral data is very small compared with the dimensionality of spectral library, which means that the abundances matrix of the endmembers is a low-rank matrix. These assumptions lead to an optimization problem for the sparse unmixing model that requires minimizing a combined l2,p-norm and nuclear norm. We propose a variable splitting and augmented Lagrangian algorithm to solve the optimization problem. Experimental evaluation carried out on synthetic and real hyperspectral data shows that the proposed method outperforms the state-of-the-art algorithms with a better spectral unmixing accuracy.

Keywords: Hyperspectral unmixing, joint-sparse, low-rank representation, abundance estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 734
307 A Survey in Techniques for Imbalanced Intrusion Detection System Datasets

Authors: Najmeh Abedzadeh, Matthew Jacobs

Abstract:

An intrusion detection system (IDS) is a software application that monitors malicious activities and generates alerts if any are detected. However, most network activities in IDS datasets are normal, and the relatively few numbers of attacks make the available data imbalanced. Consequently, cyber-attacks can hide inside a large number of normal activities, and machine learning algorithms have difficulty learning and classifying the data correctly. In this paper, a comprehensive literature review is conducted on different types of algorithms for both implementing the IDS and methods in correcting the imbalanced IDS dataset. The most famous algorithms are machine learning (ML), deep learning (DL), synthetic minority over-sampling technique (SMOTE), and reinforcement learning (RL). Most of the research use the CSE-CIC-IDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets for evaluating their algorithms.

Keywords: IDS, intrusion detection system, imbalanced datasets, sampling algorithms, big data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1019
306 A Robust LS-SVM Regression

Authors: József Valyon, Gábor Horváth

Abstract:

In comparison to the original SVM, which involves a quadratic programming task; LS–SVM simplifies the required computation, but unfortunately the sparseness of standard SVM is lost. Another problem is that LS-SVM is only optimal if the training samples are corrupted by Gaussian noise. In Least Squares SVM (LS–SVM), the nonlinear solution is obtained, by first mapping the input vector to a high dimensional kernel space in a nonlinear fashion, where the solution is calculated from a linear equation set. In this paper a geometric view of the kernel space is introduced, which enables us to develop a new formulation to achieve a sparse and robust estimate.

Keywords: Support Vector Machines, Least Squares SupportVector Machines, Regression, Sparse approximation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2028
305 A Generalized Sparse Bayesian Learning Algorithm for Near-Field Synthetic Aperture Radar Imaging: By Exploiting Impropriety and Noncircularity

Authors: Pan Long, Bi Dongjie, Li Xifeng, Xie Yongle

Abstract:

The near-field synthetic aperture radar (SAR) imaging is an advanced nondestructive testing and evaluation (NDT&E) technique. This paper investigates the complex-valued signal processing related to the near-field SAR imaging system, where the measurement data turns out to be noncircular and improper, meaning that the complex-valued data is correlated to its complex conjugate. Furthermore, we discover that the degree of impropriety of the measurement data and that of the target image can be highly correlated in near-field SAR imaging. Based on these observations, A modified generalized sparse Bayesian learning algorithm is proposed, taking impropriety and noncircularity into account. Numerical results show that the proposed algorithm provides performance gain, with the help of noncircular assumption on the signals.

Keywords: Complex-valued signal processing, synthetic aperture radar (SAR), 2-D radar imaging, compressive sensing, Sparse Bayesian learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
304 Visual Analytics of Higher Order Information for Trajectory Datasets

Authors: Ye Wang, Ickjai Lee

Abstract:

Due to the widespread of mobile sensing, there is a strong need to handle trails of moving objects, and trajectories. This paper proposes three visual analytics approaches for higher order information of trajectory datasets based on the higher order Voronoi diagram data structure. Proposed approaches reveal geometrical, topological, and directional information. Experimental resultsdemonstrate the applicability and usefulness of proposed three approaches.

Keywords: Visual Analytics, Higher Order Information, Trajectory Datasets, Spatio-temporal data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1597
303 Sparse Networks-Based Speedup Technique for Proteins Betweenness Centrality Computation

Authors: Razvan Bocu, Dr Sabin Tabirca

Abstract:

The study of proteomics reached unexpected levels of interest, as a direct consequence of its discovered influence over some complex biological phenomena, such as problematic diseases like cancer. This paper presents the latest authors- achievements regarding the analysis of the networks of proteins (interactome networks), by computing more efficiently the betweenness centrality measure. The paper introduces the concept of betweenness centrality, and then describes how betweenness computation can help the interactome net- work analysis. Current sequential implementations for the between- ness computation do not perform satisfactory in terms of execution times. The paper-s main contribution is centered towards introducing a speedup technique for the betweenness computation, based on modified shortest path algorithms for sparse graphs. Three optimized generic algorithms for betweenness computation are described and implemented, and their performance tested against real biological data, which is part of the IntAct dataset.

Keywords: Betweenness centrality, interactome networks, protein-protein interactions, sub-communities, sparse networks, speedup tech-nique, IntAct.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1466
302 A Prototype of Augmented Reality for Visualising Large Sensors’ Datasets

Authors: Folorunso Olufemi Ayinde, Mohd Shahrizal Sunar, Sarudin Kari, Dzulkifli Mohamad

Abstract:

In this paper we discuss the development of an Augmented Reality (AR) - based scientific visualization system prototype that supports identification, localisation, and 3D visualisation of oil leakages sensors datasets. Sensors generates significant amount of multivariate datasets during normal and leak situations. Therefore we have developed a data model to effectively manage such data and enhance the computational support needed for the effective data explorations. A challenge of this approach is to reduce the data inefficiency powered by the disparate, repeated, inconsistent and missing attributes of most available sensors datasets. To handle this challenge, this paper aim to develop an AR-based scientific visualization interface which automatically identifies, localise and visualizes all necessary data relevant to a particularly selected region of interest (ROI) along the virtual pipeline network. Necessary system architectural supports needed as well as the interface requirements for such visualizations are also discussed in this paper.

Keywords: Sensor Leakages Datasets, Augmented Reality, Sensor Data-Model, Scientific Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1620
301 Sparsity-Aware Affine Projection Algorithm for System Identification

Authors: Young-Seok Choi

Abstract:

This work presents a new type of the affine projection (AP) algorithms which incorporate the sparsity condition of a system. To exploit the sparsity of the system, a weighted l1-norm regularization is imposed on the cost function of the AP algorithm. Minimizing the cost function with a subgradient calculus and choosing two distinct weighting for l1-norm, two stochastic gradient based sparsity regularized AP (SR-AP) algorithms are developed. Experimental results exhibit that the SR-AP algorithms outperform the typical AP counterparts for identifying sparse systems.

Keywords: System identification, adaptive filter, affine projection, sparsity, sparse system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1507
300 Performance Analysis and Optimization for Diagonal Sparse Matrix-Vector Multiplication on Machine Learning Unit

Authors: Qiuyu Dai, Haochong Zhang, Xiangrong Liu

Abstract:

Efficient matrix-vector multiplication with diagonal sparse matrices is pivotal in a multitude of computational domains, ranging from scientific simulations to machine learning workloads. When encoded in the conventional Diagonal (DIA) format, these matrices often induce computational overheads due to extensive zero-padding and non-linear memory accesses, which can hamper the computational throughput, and elevate the usage of precious compute and memory resources beyond necessity. The ’DIA-Adaptive’ approach, a methodological enhancement introduced in this paper, confronts these challenges head-on by leveraging the advanced parallel instruction sets embedded within Machine Learning Units (MLUs). This research presents a thorough analysis of the DIA-Adaptive scheme’s efficacy in optimizing Sparse Matrix-Vector Multiplication (SpMV) operations. The scope of the evaluation extends to a variety of hardware architectures, examining the repercussions of distinct thread allocation strategies and cluster configurations across multiple storage formats. A dedicated computational kernel, intrinsic to the DIA-Adaptive approach, has been meticulously developed to synchronize with the nuanced performance characteristics of MLUs. Empirical results, derived from rigorous experimentation, reveal that the DIA-Adaptive methodology not only diminishes the performance bottlenecks associated with the DIA format but also exhibits pronounced enhancements in execution speed and resource utilization. The analysis delineates a marked improvement in parallelism, showcasing the DIA-Adaptive scheme’s ability to adeptly manage the interplay between storage formats, hardware capabilities, and algorithmic design. The findings suggest that this approach could set a precedent for accelerating SpMV tasks, thereby contributing significantly to the broader domain of high-performance computing and data-intensive applications.

Keywords: Adaptive method, DIA, diagonal sparse matrices, MLU, sparse matrix-vector multiplication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 138
299 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3494
298 A Fast HRRP Synthesis Algorithm with Sensing Dictionary in GTD Model

Authors: R. Fan, Q. Wan, H. Chen, Y.L. Liu, Y.P. Liu

Abstract:

In the paper, a fast high-resolution range profile synthetic algorithm called orthogonal matching pursuit with sensing dictionary (OMP-SD) is proposed. It formulates the traditional HRRP synthetic to be a sparse approximation problem over redundant dictionary. As it employs a priori that the synthetic range profile (SRP) of targets are sparse, SRP can be accomplished even in presence of data lost. Besides, the computation complexity decreases from O(MNDK) flops for OMP to O(M(N + D)K) flops for OMP-SD by introducing sensing dictionary (SD). Simulation experiments illustrate its advantages both in additive white Gaussian noise (AWGN) and noiseless situation, respectively.

Keywords: GTD-based model, HRRP, orthogonal matching pursuit, sensing dictionary.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1873
297 A Hybrid Recommender System based on Collaborative Filtering and Cloud Model

Authors: Chein-Shung Hwang, Ruei-Siang Fong

Abstract:

User-based Collaborative filtering (CF), one of the most prevailing and efficient recommendation techniques, provides personalized recommendations to users based on the opinions of other users. Although the CF technique has been successfully applied in various applications, it suffers from serious sparsity problems. The cloud-model approach addresses the sparsity problems by constructing the user-s global preference represented by a cloud eigenvector. The user-based CF approach works well with dense datasets while the cloud-model CF approach has a greater performance when the dataset is sparse. In this paper, we present a hybrid approach that integrates the predictions from both the user-based CF and the cloud-model CF approaches. The experimental results show that the proposed hybrid approach can ameliorate the sparsity problem and provide an improved prediction quality.

Keywords: Cloud model, Collaborative filtering, Hybridrecommender system

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1909