Search results for: sparsity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24

24 Sparse Signal Restoration Algorithm Based on Piecewise Adaptive Backtracking Orthogonal Least Squares

Authors: Linyu Wang, Jiahui Ma, Jianhong Xiang, Hanyu Jiang

Abstract:

Traditional greedy compressed sensing algorithms need to know the signal sparsity in order to recover the signal, but in practical applications the sparsity cannot be obtained as a priori information, and the resulting recovery accuracy is low, which does not meet practical needs. To solve this problem, this paper puts forward a piecewise adaptive backtracking orthogonal least squares algorithm. The algorithm is divided into two stages. In the first stage, a sparsity pre-estimation strategy is adopted, which can quickly approach the real sparsity and reduce time consumption. In the second stage, a correction strategy and an adaptive step size are used to accurately estimate the sparsity, and the backtracking idea is introduced to improve the accuracy of signal recovery. Experimental simulation shows that the algorithm can accurately recover the signal with fewer iterations when the sparsity is unknown.

Keywords: compressed sensing, greedy algorithm, least square method, adaptive reconstruction

Procedia PDF Downloads 109
23 Sparsity Order Selection and Denoising in Compressed Sensing Framework

Authors: Mahdi Shamsi, Tohid Yousefi Rezaii, Siavash Eftekharifar

Abstract:

Compressed sensing (CS) is a powerful new mathematical theory concentrating on sparse signals, and it is widely used in signal processing. The main idea is to sense sparse signals with far fewer measurements than the Nyquist sampling rate requires, but the reconstruction process becomes nonlinear and more complicated. A common dilemma in sparse signal recovery in CS is the lack of knowledge about the sparsity order of the signal, which can be viewed as a model order selection problem. In this paper, we address the problem of sparsity order estimation in sparse signal recovery. This is of main interest in situations where the signal sparsity is unknown or the signal to be recovered is only approximately sparse. It is shown that the proposed method also leads to a form of signal denoising when the observations are contaminated with noise. Finally, the performance of the proposed approach is evaluated in different scenarios and compared to an existing method, which shows the effectiveness of the proposed method in terms of order selection as well as denoising.

Keywords: compressed sensing, data denoising, model order selection, sparse representation

Procedia PDF Downloads 448
22 Sparsity-Based Unsupervised Unmixing of Hyperspectral Imaging Data Using Basis Pursuit

Authors: Ahmed Elrewainy

Abstract:

Mixing in hyperspectral imaging occurs due to the low spatial resolution of the cameras used. The pure materials (“endmembers”) present in the scene share the pixel spectra in different amounts called “abundances”. Unmixing of the data cube is an important task for identifying the endmembers present in the cube when analyzing these images. Unsupervised unmixing is done with no information about the given data cube. Sparsity is one of the recent approaches used in source recovery or unmixing techniques. The l1-norm optimization problem “basis pursuit” can be used as a sparsity-based approach to solve this unmixing problem, where the endmembers are assumed to be sparse in an appropriate domain known as a dictionary. This optimization problem is solved using a proximal method, “iterative thresholding”. The l1-norm basis pursuit optimization problem was used as a sparsity-based unmixing technique to unmix real and synthetic hyperspectral data cubes.
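
The abstract does not give the solver details, so the following is only a minimal sketch of iterative soft thresholding applied to the l1-regularized least squares form of basis pursuit, min 0.5*||A x - y||^2 + lam*||x||_1. The names `soft_threshold`, `ista`, `lam` and `step`, and the toy dictionary, are assumptions made for illustration and are not taken from the paper. The step size is assumed to be at most 1 divided by the largest eigenvalue of A^T A.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, y, lam, step, n_iter=500):
    """Iterative soft thresholding for 0.5*||A x - y||^2 + lam*||x||_1.

    A is a list of rows (m x n), y has length m.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = A x - y
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the quadratic term: g = A^T r
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by soft thresholding
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x


# Toy example with an orthonormal (identity) dictionary: the solution is
# simply the soft-thresholded observation.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.1, -2.0]
print(ista(A, y, lam=0.5, step=1.0))  # → [2.5, 0.0, -1.5]
```

With the identity dictionary the small entry 0.1 is driven exactly to zero, which is the sparsity-promoting effect of the l1 penalty. A real unmixing problem would use a spectral library or wavelet dictionary for `A`.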

Keywords: basis pursuit, blind source separation, hyperspectral imaging, spectral unmixing, wavelets

Procedia PDF Downloads 172
21 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity

Authors: Shaan Khosla, Jon Krohn

Abstract:

In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies for a candidate, and to compile a set of similar candidates for any given candidate. While these models are useful to create, validating their accuracy in a recommendation context is difficult due to a sparsity of data. In this report, we use network graph data to generate useful representations for candidates and vacancies. We use candidates and vacancies as network nodes and designate a bi-directional link between them based on the candidate interviewing for the vacancy. After applying node2vec, the embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.

Keywords: AI, machine learning, NLP, recruiting

Procedia PDF Downloads 53
20 Sparse Unmixing of Hyperspectral Data by Exploiting Joint-Sparsity and Rank-Deficiency

Authors: Fanqiang Kong, Chending Bian

Abstract:

In this work, we exploit two assumed properties of the abundances of the observed signatures (endmembers) in order to reconstruct the abundances from hyperspectral data. Joint-sparsity is the first property of the abundances, which assumes the adjacent pixels can be expressed as different linear combinations of same materials. The second property is rank-deficiency where the number of endmembers participating in hyperspectral data is very small compared with the dimensionality of spectral library, which means that the abundances matrix of the endmembers is a low-rank matrix. These assumptions lead to an optimization problem for the sparse unmixing model that requires minimizing a combined l2,p-norm and nuclear norm. We propose a variable splitting and augmented Lagrangian algorithm to solve the optimization problem. Experimental evaluation carried out on synthetic and real hyperspectral data shows that the proposed method outperforms the state-of-the-art algorithms with a better spectral unmixing accuracy.

Keywords: hyperspectral unmixing, joint-sparse, low-rank representation, abundance estimation

Procedia PDF Downloads 214
19 Building Scalable and Accurate Hybrid Kernel Mapping Recommender

Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak

Abstract:

Recommender systems use artificial intelligence practices to filter obscure information and can predict whether a user likes a specified item. Kernel Mapping Recommender (KMR) systems have been proposed as accurate, state-of-the-art algorithms that address recommender system design objectives such as long tail, cold-start, and sparsity. The aim of this research is to propose a hybrid framework that can efficiently integrate different versions of the KMR algorithm, namely item-based and user-based KMR. We have proposed various heuristic algorithms that integrate different versions of KMR into a unified framework, resulting in improved accuracy and elimination of problems associated with conventional recommender systems. We have tested our system on a publicly available movie dataset and benchmarked it against KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate, especially under cold-start and sparse scenarios.

Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail

Procedia PDF Downloads 306
18 Unsupervised Learning of Spatiotemporally Coherent Metrics

Authors: Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Abstract:

Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity. We establish a connection between slow feature learning and metric learning and show that the trained encoder can be used to define a more temporally and semantically coherent metric.

Keywords: machine learning, pattern clustering, pooling, classification

Procedia PDF Downloads 419
17 Sparse Principal Component Analysis: A Least Squares Approximation Approach

Authors: Giovanni Merola

Abstract:

Sparse Principal Components Analysis aims to find principal components with few non-zero loadings. We derive such sparse solutions by adding a genuine sparsity requirement to the original Principal Components Analysis (PCA) objective function. This approach differs from others because it preserves PCA's original optimality: uncorrelatedness of the components and least squares approximation of the data. To identify the best subset of non-zero loadings we propose a branch-and-bound search and an iterative elimination algorithm. This last algorithm finds sparse solutions with large loadings and can be run without specifying the cardinality of the loadings and the number of components to compute in advance. We give thorough comparisons with the existing sparse PCA methods and several examples on real datasets.

Keywords: SPCA, uncorrelated components, branch-and-bound, backward elimination

Procedia PDF Downloads 339
16 On Direct Matrix Factored Inversion via Broyden's Updates

Authors: Adel Mohsen

Abstract:

A direct method based on the good Broyden's updates for evaluating the inverse of a nonsingular square matrix of full rank and solving the related system of linear algebraic equations is studied. For a matrix A of order n whose LU-decomposition is A = LU, the multiplication count is O(n^3). This includes the evaluation of the LU-decompositions of the inverse, the lower triangular decomposition of A as well as a “reduced matrix inverse”. If an explicit value of the inverse is not needed, the order reduces to O(n^3 / 2) to compute inv(U) and the reduced inverse. For a symmetric matrix only O(n^3 / 3) operations are required to compute inv(L) and the reduced inverse. An example is presented to demonstrate the capability of using the reduced matrix inverse in treating ill-conditioned systems. Besides the simplicity of Broyden's update, the method provides a means to exploit the possible sparsity in the matrix and to derive a suitable preconditioner.

Keywords: Broyden's updates, matrix inverse, inverse factorization, solution of linear algebraic equations, ill-conditioned matrices, preconditioning

Procedia PDF Downloads 446
15 Scalable Learning of Tree-Based Models on Sparsely Representable Data

Authors: Fares Hedayatit, Arnauld Joly, Panagiotis Papadimitriou

Abstract:

Many machine learning tasks, such as text annotation, require training over very big datasets, e.g., millions of web documents, that can be represented in a sparse input space. State-of-the-art tree-based ensemble algorithms cannot scale to such datasets, since they include operations whose running time is a function of the input space size rather than a function of the number of non-zero input elements. In this paper, we propose an efficient splitting algorithm to leverage input sparsity within decision tree methods. Our algorithm improves training time over sparse datasets by more than two orders of magnitude, and it has been incorporated in the current version of scikit-learn.org, the most popular open source Python machine learning library.

Keywords: big data, sparsely representable data, tree-based models, scalable learning

Procedia PDF Downloads 229
14 Bridging the Data Gap for Sexism Detection in Twitter: A Semi-Supervised Approach

Authors: Adeep Hande, Shubham Agarwal

Abstract:

This paper presents a study on identifying sexism in online texts using various state-of-the-art deep learning models based on BERT. We experimented with different feature sets and model architectures and evaluated their performance using precision, recall, F1 score, and accuracy metrics. We also explored the use of the pseudolabeling technique to improve model performance. Our experiments show that the best-performing models were based on BERT, and the multilingual model achieved an F1 score of 0.83. Furthermore, pseudolabeling significantly improved the performance of the BERT-based models, which achieved their best results when it was used. Our findings suggest that BERT-based models with pseudolabeling hold great promise for identifying sexism in online texts with high accuracy.

Keywords: large language models, semi-supervised learning, sexism detection, data sparsity

Procedia PDF Downloads 34
13 A Quantitative Evaluation of Text Feature Selection Methods

Authors: B. S. Harish, M. B. Revanasiddappa

Abstract:

Due to the rapid growth of text documents in digital form, automated text classification has become an important research area in the last two decades. The major challenges of text document representations are high dimension, sparsity, volume and semantics. Since the terms are the only features that can be found in documents, selection of good terms (features) plays a very important role. In text classification, feature selection is a strategy that can be used to improve classification effectiveness, computational efficiency and accuracy. In this paper, we present a quantitative analysis of the most widely used feature selection (FS) methods, viz. Term Frequency-Inverse Document Frequency (tfidf), Mutual Information (MI), Information Gain (IG), Chi-Square (χ²), Term Frequency-Relevance Frequency (tfrf), Term Strength (TS), Ambiguity Measure (AM) and Symbolic Feature Selection (SFS) to classify text documents. We evaluated all the feature selection methods on standard datasets such as 20 Newsgroups, the 4 University dataset and Reuters-21578.

Keywords: classifiers, feature selection, text classification

Procedia PDF Downloads 419
12 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral. Sentiment analysis of documents holds great importance in today's world, when vast amounts of information are stored in databases and on the World Wide Web. An efficient algorithm to elicit such information would be beneficial for social, economic as well as medical purposes. In this project, we have developed an algorithm to classify a document as positive or negative. Using our algorithm, we obtained a feature set from the data and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is made by many procedures such as Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use the empirical distribution for estimation, but developed a method based on finding the degree of close clustering of the data points. We have applied our algorithm to a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 369
11 HR MRI CS Based Image Reconstruction

Authors: Krzysztof Malczewski

Abstract:

A Magnetic Resonance Imaging (MRI) reconstruction algorithm using compressed sensing is presented in this paper. It is shown that the proposed approach improves the spatial resolution of MR images when highly undersampled k-space trajectories are applied. Compressed Sensing (CS) aims at reconstructing signals and images from significantly fewer measurements than were conventionally assumed necessary. MRI is a fundamental medical imaging method that struggles with an inherently slow data acquisition process. The application of CS to MRI has the potential for significant scan time reductions, with visible benefits for patients and health care economics. In this study the objective is to combine a super-resolution image enhancement algorithm with the benefits of the CS framework to achieve a high-resolution MR output image. Both methods emphasize maximizing image sparsity in a known sparse transform domain and minimizing fidelity. The presented algorithm considers the cardiac and respiratory movements.

Keywords: super-resolution, MRI, compressed sensing, sparse-sense, image enhancement

Procedia PDF Downloads 394
10 Spherical Harmonic Based Monostatic Anisotropic Point Scatterer Model for RADAR Applications

Authors: Eric Huang, Coleman DeLude, Justin Romberg, Saibal Mukhopadhyay, Madhavan Swaminathan

Abstract:

High performance computing (HPC) based emulators can be used to model the scattering from multiple stationary and moving targets for RADAR applications. These emulators rely on the RADAR Cross Section (RCS) of the targets being available in complex scenarios. Representing the RCS using tables generated from electromagnetic (EM) simulations is often cumbersome, leading to large storage requirements. This paper proposes a spherical harmonic based anisotropic scatterer model to represent the RCS of complex targets. The problem of finding the locations and reflection profiles of all scatterers can be formulated as a linear least squares problem with a special sparsity constraint. This paper solves this problem using a modified Orthogonal Matching Pursuit algorithm. The results show that the spherical harmonic based scatterer model can effectively represent the RCS data of complex targets.

Keywords: RADAR, RCS, high performance computing, point scatterer model

Procedia PDF Downloads 162
9 A Transform Domain Function Controlled VSSLMS Algorithm for Sparse System Identification

Authors: Cemil Turan, Mohammad Shukri Salman

Abstract:

The convergence rate of the least-mean-square (LMS) algorithm deteriorates if the input signal to the filter is correlated. In a system identification problem, this convergence rate can be improved if the signal is white and/or if the system is sparse. We recently proposed a sparse transform domain LMS-type algorithm that uses a variable step-size for sparse system identification. The proposed algorithm provided high performance even when the input signal was highly correlated. In this work, we investigate the performance of the proposed TD-LMS algorithm for a large number of filter taps, which is also a critical issue for the standard LMS algorithm. Additionally, the optimum value of the most important parameter is calculated for all experiments. Moreover, the convergence analysis of the proposed algorithm is provided. The performance of the proposed algorithm has been compared to different algorithms in a sparse system identification setting with different sparsity levels and different numbers of filter taps. Simulations have shown that the proposed algorithm has superior performance compared to the other algorithms.
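
For orientation, the sketch below shows only the plain fixed-step LMS baseline used to identify a sparse FIR system. It is not the authors' transform-domain, variable step-size algorithm, which the abstract does not specify. The names `fir_filter`, `lms_identify`, `h_true` and the step size `mu = 0.01` are illustrative assumptions, and the input is assumed to be white, noise-free Gaussian data.

```python
import random


def fir_filter(h, x):
    """Output of an FIR system with impulse response h driven by input x."""
    buf = [0.0] * len(h)
    out = []
    for x_n in x:
        buf = [x_n] + buf[:-1]  # shift the newest sample into the delay line
        out.append(sum(hi * bi for hi, bi in zip(h, buf)))
    return out


def lms_identify(x, d, n_taps, mu):
    """Plain LMS: w <- w + mu * e[n] * x_vec[n], where e[n] = d[n] - w . x_vec[n]."""
    w = [0.0] * n_taps
    buf = [0.0] * n_taps
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]
        e_n = d_n - sum(wi * bi for wi, bi in zip(w, buf))
        w = [wi + mu * e_n * bi for wi, bi in zip(w, buf)]
    return w


# Hypothetical sparse system: 2 non-zero taps out of 8.
random.seed(0)
h_true = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, -0.5, 0.0]
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
d = fir_filter(h_true, x)
w_hat = lms_identify(x, d, n_taps=len(h_true), mu=0.01)
print([round(w, 3) for w in w_hat])
```

With white input this baseline converges quickly. Replacing `x` with a strongly correlated signal slows convergence, which is the deterioration the abstract refers to.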

Keywords: adaptive filtering, sparse system identification, TD-LMS algorithm, VSSLMS algorithm

Procedia PDF Downloads 321
8 Novel Recommender Systems Using Hybrid CF and Social Network Information

Authors: Kyoung-Jae Kim

Abstract:

Collaborative Filtering (CF) is a popular technique for personalization in the e-commerce domain to reduce information overload. In general, CF provides a list of recommended items based on other similar users’ preferences from the user-item matrix and uses them to predict the focal user’s preference for particular items. Many real-world recommender systems use CF techniques because of their excellent accuracy and robustness. However, CF has some limitations, including sparsity problems and complex dimensionality in the user-item matrix. In addition, traditional CF does not consider the emotional interaction between users. In this study, we propose recommender systems using social network information and singular value decomposition (SVD) to alleviate some of these limitations. The purpose of this study is to reduce the dimensionality of the data set using SVD and to improve the performance of CF by using emotional information from the social network data of the focal user. We test the usability of a hybrid model combining CF, SVD and social network information using real-world data. The experimental results show that the proposed model outperforms conventional CF models.

Keywords: recommender systems, collaborative filtering, social network information, singular value decomposition

Procedia PDF Downloads 253
7 System Identification in Presence of Outliers

Authors: Chao Yu, Qing-Guo Wang, Dan Zhang

Abstract:

The outlier detection problem for dynamic systems is formulated as a matrix decomposition problem with low-rank, sparse matrices and further recast as a semidefinite programming (SDP) problem. A fast algorithm is presented to solve the resulting problem while keeping the solution matrix structure and it can greatly reduce the computational cost over the standard interior-point method. The computational burden is further reduced by proper construction of subsets of the raw data without violating low rank property of the involved matrix. The proposed method can make exact detection of outliers in case of no or little noise in output observations. In case of significant noise, a novel approach based on under-sampling with averaging is developed to denoise while retaining the saliency of outliers and so-filtered data enables successful outlier detection with the proposed method while the existing filtering methods fail. Use of recovered “clean” data from the proposed method can give much better parameter estimation compared with that based on the raw data.

Keywords: outlier detection, system identification, matrix decomposition, low-rank matrix, sparsity, semidefinite programming, interior-point methods, denoising

Procedia PDF Downloads 281
6 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level-based units. For subword-level models, using back translation proves to be slightly beneficial in low-resource (WO) to high-resource (FR) language translation for the transformer (but not for the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied method data with back translation leads to a substantial improvement of the translation quality.

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

Procedia PDF Downloads 110
5 PET Image Resolution Enhancement

Authors: Krzysztof Malczewski

Abstract:

PET is a widely applied scanning procedure in medical imaging based research. It delivers measurements of functioning in distinct areas of the human brain while the patient is comfortable, conscious and alert. This article presents a new compressed sensing based super-resolution algorithm for improving the image resolution in clinical Positron Emission Tomography (PET) scanners. Motion artifacts are a well-known side effect in PET studies. PET images are acquired over a limited period of time. As patients cannot hold their breath during PET data gathering, spatial blurring and motion artifacts are the usual result. These may lead to a wrong diagnosis. It is shown that the presented approach improves PET spatial resolution in cases when Compressed Sensing (CS) sequences are used. CS aims at reconstructing signals and images from significantly fewer measurements than were traditionally thought necessary. The application of CS to PET has the potential for significant scan time reductions, with visible benefits for patients and health care economics. In this study the goal is to combine a super-resolution image enhancement algorithm with the CS framework to achieve high-resolution PET output. Both methods emphasize maximizing image sparsity in a known sparse transform domain and minimizing fidelity.

Keywords: PET, super-resolution, image reconstruction, pattern recognition

Procedia PDF Downloads 339
4 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis were collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedures and laboratory tests is analyzed. A temporal pattern mining technique called Frequent Subgraph Mining (FSM) is used. The Model for End-stage Liver Disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement in the area under the curve (AUC). The FSM technique itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods to predicting the one-year mortality of cirrhotic patients and to build a model that predicts one-year mortality significantly more accurately than the MELD score. We have also tested the potential of FSM and provided a new perspective on the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia PDF Downloads 106
3 Atomic Decomposition Audio Data Compression and Denoising Using Sparse Dictionary Feature Learning

Authors: T. Bryan, V. Kepuska, I. Kostnaic

Abstract:

A method of data compression and denoising is introduced that is based on atomic decomposition of audio data using “basis vectors” that are learned from the audio data itself. The basis vectors are shown to have higher data compression and better signal-to-noise enhancement than the Gabor and gammatone “seed atoms” that were used to generate them. The basis vectors are the input weights of a Sparse AutoEncoder (SAE) that is trained using “envelope samples” of windowed segments of the audio data. The envelope samples are extracted from the audio data by performing atomic decomposition with Gabor or gammatone seed atoms. This process identifies segments of audio data that are locally coherent with the seed atoms. Envelope samples are extracted by identifying locally coherent audio data segments with Gabor or gammatone seed atoms, found by matching pursuit. The envelope samples are formed by taking the Kronecker products of the atomic envelopes with the locally coherent data segments. Oracle signal-to-noise ratio (SNR) versus data compression curves are generated for the seed atoms as well as the basis vectors learned from Gabor and gammatone seed atoms. SNR versus data compression curves are generated for speech signals as well as early American music recordings. The basis vectors are shown to have higher denoising capability for data compression rates ranging from 90% to 99.84% for speech as well as music. Envelope samples are displayed as images by folding the time series into column vectors. This display method is used to compare the output of the SAE with the envelope samples that produced it. The basis vectors are also displayed as images. Sparsity is shown to play an important role in producing the basis vectors with the highest denoising capability.
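
Matching pursuit, which the abstract uses to locate coherent segments, can be sketched generically as follows. This is only an illustration under stated assumptions: the atoms are assumed to have unit norm, the toy dictionary stands in for the Gabor or gammatone seed atoms, and the names `matching_pursuit`, `atoms` and `n_iter` are hypothetical rather than taken from the paper.

```python
import math


def matching_pursuit(signal, atoms, n_iter):
    """Greedy atomic decomposition with unit-norm atoms.

    At each step, pick the atom most correlated with the residual,
    add that correlation to its coefficient, and subtract its contribution.
    Returns (coefficients, residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Inner product of every atom with the current residual
        corr = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        coeffs[k] += corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


# Toy overcomplete dictionary in R^3 (all atoms have unit norm).
s = 1.0 / math.sqrt(2.0)
atoms = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [s, s, 0.0],
]
coeffs, residual = matching_pursuit([2.0, 0.0, -1.0], atoms, n_iter=2)
print(coeffs, residual)
```

Stopping after a few iterations keeps only the strongest atoms, which is what makes the representation sparse and usable for compression.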

Keywords: sparse dictionary learning, autoencoder, sparse autoencoder, basis vectors, atomic decomposition, envelope sampling, envelope samples, Gabor, gammatone, matching pursuit

Procedia PDF Downloads 221
2 Event Data Representation Based on Time Stamp for Pedestrian Detection

Authors: Yuta Nakano, Kozo Kajiwara, Atsushi Hori, Takeshi Fujita

Abstract:

In association with the wave of electric vehicles (EV), low energy consumption systems have become more and more important. One of the key technologies to realize low energy consumption is the dynamic vision sensor (DVS), also called an event sensor or neuromorphic vision sensor. This sensor has several features, such as high temporal resolution, which can reach 1 Mframe/s, and a high dynamic range (120 dB). However, the feature that contributes most to low energy consumption is its sparsity; to be more specific, this sensor only captures the pixels that have an intensity change. In other words, there is no signal in areas without any intensity change. That is to say, this sensor is more energy efficient than conventional sensors such as RGB cameras because redundant data can be removed. On the other hand, the data are difficult to handle because the data format is completely different from that of an RGB image; for example, the acquired signals are asynchronous and sparse, and each signal is composed of an x-y coordinate, a polarity (two values: +1 or -1) and a time stamp, and does not include intensity such as RGB values. Therefore, as existing algorithms cannot be used straightforwardly, a new processing algorithm has to be designed to cope with DVS data. In order to solve the difficulties caused by data format differences, most prior art builds frame data and feeds it to deep learning models such as Convolutional Neural Networks (CNN) for object detection and recognition purposes. However, even though the data can be fed in this way, it is still difficult to achieve good performance due to a lack of intensity information. Although polarity is often used as intensity instead of RGB pixel values, it is apparent that polarity information is not rich enough. Considering this context, we proposed to use the timestamp information as a data representation that is fed to deep learning.
Concretely, we first make frame data divided by a certain time period, then assign an intensity value according to the timestamp in each frame; for example, a high value is given to a recent signal. We expected that this data representation could capture the features, especially of moving objects, because the timestamp represents the movement direction and speed. Using this proposed method, we made our own dataset with a DVS fixed on a parked car to develop an application for a surveillance system that can detect persons around the car. We think the DVS is one of the ideal sensors for surveillance purposes because this sensor can run for a long time with low energy consumption in a non-dynamic situation. For comparison purposes, we reproduced a state-of-the-art method as a benchmark, which makes frames in the same way as ours and feeds polarity information to a CNN. Then, we measured the object detection performance of the benchmark and of our method on the same dataset. As a result, our method achieved an F1 score up to 7 points higher than the benchmark.
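
One possible reading of the timestamp-based representation is sketched below. The abstract only states that recent signals receive high values, so the linear mapping from timestamp to intensity, the choice to keep the most recent event per pixel, and the names `timestamp_frame`, `t_start` and `t_end` are all assumptions made for illustration, not the authors' implementation.

```python
def timestamp_frame(events, width, height, t_start, t_end):
    """Build one frame from DVS events (x, y, polarity, t).

    Each pixel holds the normalized timestamp of its most recent event in
    the window (t_start, t_end]: values lie in (0, 1], newer events are
    brighter, and pixels without events stay at 0. Polarity is ignored in
    this sketch.
    """
    frame = [[0.0] * width for _ in range(height)]
    span = t_end - t_start
    for x, y, polarity, t in events:
        if t_start < t <= t_end:
            value = (t - t_start) / span  # assumed linear recency weighting
            frame[y][x] = max(frame[y][x], value)
    return frame


# Three events on a 2 x 1 sensor, with timestamps in microseconds.
events = [(0, 0, +1, 10), (1, 0, -1, 50), (0, 0, +1, 100)]
print(timestamp_frame(events, width=2, height=1, t_start=0, t_end=100))
# → [[1.0, 0.5]]
```

An object moving across the sensor leaves a gradient of values along its path, which is how a single frame can carry direction and speed information.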

Keywords: event camera, dynamic vision sensor, deep learning, data representation, object recognition, low energy consumption

Procedia PDF Downloads 60
1 Partial Least Square Regression for High-Dimensional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

This research investigates partial least squares (PLS) methodology for dealing with high-dimensional correlated data. Current developments in technology have enabled experiments to produce data characterized by, first, a number of variables that far exceeds the number of observations and, second, variables that are substantially correlated with one another. Such data are commonly found in chemometrics, where absorbance levels of chemical samples are recorded across hundreds of wavelengths in the calibration of a near-infrared (NIR) spectrometer, and in genomics, where copy number alterations (CNA) are recorded across thousands of genomic regions from cancer patients. In our study, we investigated key areas to address these challenges. Firstly, we tackled the issue that the three main PLS algorithms have potentially different interpretations of relevant quantities. We unified these interpretations by identifying scenarios where all three algorithms yield the same estimates. Secondly, we explored the phenomenon of unusual negative shrinkage factors encountered during PLS model fitting. Unlike ridge regression or principal component regression, where shrinkage factors range between zero and one, PLS can exhibit factors greater than one or even negative ones, so they are more aptly termed ‘filter factors’ rather than ‘shrinkage factors’. This characteristic allows PLS to effectively handle high-dimensional data by applying shrinkage to estimates. To our knowledge, there has been no previous meaningful investigation of negative filter factors (NFF) in PLS. In this research, we present a novel result that identifies the condition under which NFF occur, and we investigate the characteristics of the data associated with NFF to gain insight. Lastly, the main challenge in applying PLS lies in the interpretation of the weights associated with the predictors. 
With hundreds or thousands of predictors, every predictor variable has a non-zero weight. However, we expect that only some predictor variables contribute to the association with the outcome variable. We therefore resort to sparse estimation of the predictor weights, in which some weights are estimated as zero and the others are non-zero. Standard lasso estimation has a weakness in dealing with correlated variables: it picks one variable within a correlation block with no clear reason for the choice. A novel approach is needed that considers the dependencies between predictor variables when estimating the weights. We propose a new method in which a new penalty function is introduced into the likelihood function associated with the estimation of the weights. The penalty function combines a lasso penalty, which imposes sparsity, with a penalty based on the Cauchy distribution and a smoother matrix, which takes into account dependencies between genomic regions. The results show that the estimates of the weights are sparse: many weights are estimated as zero, and the non-zero estimates are grouped and exhibit smoothness within each group. The interpretation of genomic regions becomes easy, and the identification of important regions for each component can be done simultaneously with prediction in a single modeling framework. We also investigated the relation between PLS and graphical modeling, using the information in the weights to construct the graph, but without success. High-dimensional data, where the number of predictors (p) exceeds the number of observations (n), are widely used in many applications of regression analysis. Ordinary least squares (OLS) regression, the most well-known method for regression problems, performs poorly with high-dimensional and highly correlated data. Previous studies have shown that there is an association between copy number alterations (CNA) in some key genes and disease phenotypes. 
Moreover, in high-dimensional data it is very important to classify samples into groups, such as tumor types in gene expression data in bioinformatics and biology. However, standard regression or classification methods fail in these cases because the predictor matrix is singular and so cannot be inverted. Hence, regularised methods such as shrinkage methods and dimension reduction methods are needed. One of the methods most often suggested in the literature is partial least squares (PLS) regression for linear regression and classification.
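
The filter factors discussed in this abstract can be computed numerically. The sketch below is an illustration only and is not taken from the paper: it assumes single-response PLS (PLS1) and relies on the standard Krylov-subspace characterization of PLS1, in which the filter factor for the i-th eigenvalue lambda_i of X'X after m components is 1 - prod_j (1 - lambda_i / theta_j), with theta_j the Ritz values of X'X on the Krylov subspace generated by X'y. The function name `pls1_filter_factors` and its return layout are hypothetical.

```python
import numpy as np

def pls1_filter_factors(X, y, m):
    """Filter factors of PLS1 with m components (illustrative sketch).

    Returns (lam, U, f, beta): eigenvalues and eigenvectors of X'X on
    centered data, the filter factor for each eigen-direction, and the
    PLS1 coefficient vector obtained from the Krylov subspace.
    """
    Xc = X - X.mean(axis=0)                 # center predictors
    yc = y - y.mean()                       # center response
    A = Xc.T @ Xc
    b = Xc.T @ yc

    # Basis of the Krylov subspace span{b, Ab, ..., A^(m-1) b};
    # each column is rescaled to limit numerical growth.
    K = np.empty((A.shape[0], m))
    v = b.copy()
    for j in range(m):
        K[:, j] = v / np.linalg.norm(v)
        v = A @ K[:, j]
    Q, _ = np.linalg.qr(K)                  # orthonormal basis of the subspace

    T = Q.T @ A @ Q                         # projection of X'X onto the subspace
    theta = np.linalg.eigvalsh(T)           # Ritz values
    lam, U = np.linalg.eigh(A)              # eigen-decomposition of X'X

    # Filter factors: values outside [0, 1] (above one or negative) are possible.
    f = 1.0 - np.prod(1.0 - lam[:, None] / theta[None, :], axis=1)

    # PLS1 coefficients as the least squares solution restricted to the subspace.
    beta = Q @ np.linalg.solve(T, Q.T @ b)
    return lam, U, f, beta
```

Ridge regression, by comparison, has filter factors lambda_i / (lambda_i + k) for a penalty k > 0, which always lie strictly between zero and one; inspecting `f` from the sketch on a given dataset shows whether PLS leaves that range. The explicit Krylov basis becomes ill-conditioned as m grows, so the sketch is suited to small m only.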

Keywords: negative filter factors, partial least square regression, high-dimensional data, biostatistics, bioinformatics

Procedia PDF Downloads 11