Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 25

Search results for: sparsity

25 Sparse Signal Restoration Algorithm Based on Piecewise Adaptive Backtracking Orthogonal Least Squares

Authors: Linyu Wang, Jiahui Ma, Jianhong Xiang, Hanyu Jiang

Abstract:

The traditional greedy compressed sensing algorithm needs to know the signal sparsity when recovering the signal, but in practical applications the sparsity cannot be obtained as a priori information, and the recovery accuracy is low, which does not meet practical needs. To solve this problem, this paper puts forward a piecewise adaptive backtracking orthogonal least squares algorithm. The algorithm is divided into two stages. In the first stage, a sparsity pre-estimation strategy is adopted, which can quickly approach the real sparsity and reduce time consumption. In the second stage, a correction strategy and an adaptive step size are used to accurately estimate the sparsity, and the backtracking idea is introduced to improve the accuracy of signal recovery. Experimental simulations show that the algorithm can accurately recover the signal with fewer iterations when the sparsity is unknown.

Keywords: compressed sensing, greedy algorithm, least square method, adaptive reconstruction

Procedia PDF Downloads 146

24 Sparsity Order Selection and Denoising in Compressed Sensing Framework

Authors: Mahdi Shamsi, Tohid Yousefi Rezaii, Siavash Eftekharifar

Abstract:

Compressed sensing (CS) is a powerful new mathematical theory concentrating on sparse signals, which is widely used in signal processing. The main idea is to sense sparse signals with far fewer measurements than the Nyquist sampling rate requires, but the reconstruction process becomes nonlinear and more complicated. A common dilemma in sparse signal recovery in CS is the lack of knowledge about the sparsity order of the signal, which can be viewed as a model order selection problem. In this paper, we address the problem of sparsity order estimation in sparse signal recovery. This is of main interest in situations where the signal sparsity is unknown or the signal to be recovered is approximately sparse. It is shown that the proposed method also leads to a form of signal denoising when the observations are contaminated with noise. Finally, the performance of the proposed approach is evaluated in different scenarios and compared to an existing method, which shows the effectiveness of the proposed method in terms of order selection as well as denoising.

Keywords: compressed sensing, data denoising, model order selection, sparse representation

Procedia PDF Downloads 483

23 Sparsity-Based Unsupervised Unmixing of Hyperspectral Imaging Data Using Basis Pursuit

Authors: Ahmed Elrewainy

Abstract:

Mixing in hyperspectral imaging occurs due to the low spatial resolution of the cameras used. The pure materials (“endmembers”) present in the scene share the spectra of the pixels in different amounts, called “abundances”. Unmixing the data cube is an important task for identifying the endmembers present in the cube when analyzing these images. Unsupervised unmixing is done with no prior information about the given data cube. Sparsity is one of the recent approaches used in source recovery or unmixing techniques. The l₁-norm optimization problem “basis pursuit” can be used as a sparsity-based approach to solve this unmixing problem, where the endmembers are assumed to be sparse in an appropriate domain known as a dictionary. This optimization problem is solved using the proximal method “iterative thresholding”. The l₁-norm basis pursuit optimization problem, as a sparsity-based unmixing technique, was used to unmix real and synthetic hyperspectral data cubes.
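The iterative-thresholding idea named in the abstract above can be illustrated with a minimal sketch. It is not the author's implementation: it solves the l₁-regularized least-squares (Lagrangian) form of basis pursuit with the generic ISTA scheme, and the names `soft_threshold`, `ista`, `D`, `lam`, and `step` are assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(D, y, lam, step, n_iter=500):
    """Approximately minimise 0.5 * ||D x - y||^2 + lam * ||x||_1.

    D is a list of rows (m x n) and y has length m. `step` should not
    exceed 1 / L, where L is the largest eigenvalue of D^T D.
    """
    m, n = len(D), len(D[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = D x - y
        r = [sum(D[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the quadratic term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by soft-thresholding (the proximal step)
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x
```

With an identity dictionary the result is simply the soft-thresholded observation, which makes the sparsifying effect easy to see: small entries of `y` are set exactly to zero, while larger ones are shrunk by `lam`.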

Keywords: basis pursuit, blind source separation, hyperspectral imaging, spectral unmixing, wavelets

Procedia PDF Downloads 195

22 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity

Authors: Shaan Khosla, Jon Krohn

Abstract:

In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies for a candidate, and to compile a set of similar candidates for any given candidate. While these models are useful to create, validating their accuracy in a recommendation context is difficult due to a sparsity of data. In this report, we use network graph data to generate useful representations for candidates and vacancies. We use candidates and vacancies as network nodes and designate a bi-directional link between them based on the candidate interviewing for the vacancy. After applying node2vec, the embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.

Keywords: AI, machine learning, NLP, recruiting

Procedia PDF Downloads 84

21 Sparse Unmixing of Hyperspectral Data by Exploiting Joint-Sparsity and Rank-Deficiency

Authors: Fanqiang Kong, Chending Bian

Abstract:

In this work, we exploit two assumed properties of the abundances of the observed signatures (endmembers) in order to reconstruct the abundances from hyperspectral data. Joint-sparsity is the first property of the abundances, which assumes that adjacent pixels can be expressed as different linear combinations of the same materials. The second property is rank-deficiency, where the number of endmembers participating in the hyperspectral data is very small compared with the dimensionality of the spectral library, which means that the abundance matrix of the endmembers is a low-rank matrix. These assumptions lead to an optimization problem for the sparse unmixing model that requires minimizing a combined l_{2,p}-norm and nuclear norm. We propose a variable splitting and augmented Lagrangian algorithm to solve the optimization problem. Experimental evaluation carried out on synthetic and real hyperspectral data shows that the proposed method outperforms the state-of-the-art algorithms with better spectral unmixing accuracy.

Keywords: hyperspectral unmixing, joint-sparse, low-rank representation, abundance estimation

Procedia PDF Downloads 260

20 Building Scalable and Accurate Hybrid Kernel Mapping Recommender

Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak

Abstract:

Recommender systems use artificial intelligence practices for filtering obscure information and can predict whether a user likes a specified item. Kernel Mapping Recommender (KMR) systems have been proposed as accurate, state-of-the-art algorithms that address recommender system design objectives such as long tail, cold-start, and sparsity. The aim of this research is to propose a hybrid framework that can efficiently integrate different versions of the KMR algorithm, namely item-based and user-based KMR. We have proposed various heuristic algorithms that integrate different versions of KMR into a unified framework, resulting in improved accuracy and the elimination of problems associated with conventional recommender systems. We have tested our system on a publicly available movie dataset and benchmarked it against KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate, especially under cold-start and sparse scenarios.

Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail

Procedia PDF Downloads 338

19 Unsupervised Learning of Spatiotemporally Coherent Metrics

Authors: Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Abstract:

Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity. We establish a connection between slow feature learning and metric learning and show that the trained encoder can be used to define a more temporally and semantically coherent metric.

Keywords: machine learning, pattern clustering, pooling, classification

Procedia PDF Downloads 455

18 Sparse Principal Component Analysis: A Least Squares Approximation Approach

Authors: Giovanni Merola

Abstract:

Sparse Principal Components Analysis aims to find principal components with few non-zero loadings. We derive such sparse solutions by adding a genuine sparsity requirement to the original Principal Components Analysis (PCA) objective function. This approach differs from others because it preserves PCA's original optimality: uncorrelatedness of the components and least squares approximation of the data. To identify the best subset of non-zero loadings we propose a branch-and-bound search and an iterative elimination algorithm. This last algorithm finds sparse solutions with large loadings and can be run without specifying the cardinality of the loadings and the number of components to compute in advance. We give thorough comparisons with the existing sparse PCA methods and several examples on real datasets.

Keywords: SPCA, uncorrelated components, branch-and-bound, backward elimination

Procedia PDF Downloads 380

17 On Direct Matrix Factored Inversion via Broyden's Updates

Authors: Adel Mohsen

Abstract:

A direct method based on the good Broyden's updates for evaluating the inverse of a nonsingular square matrix of full rank and solving the related system of linear algebraic equations is studied. For a matrix A of order n whose LU-decomposition is A = LU, the multiplication count is O(n³). This includes the evaluation of the LU-decompositions of the inverse, the lower triangular decomposition of A as well as a “reduced matrix inverse”. If an explicit value of the inverse is not needed, the order reduces to O(n³/2) to compute inv(U) and the reduced inverse. For a symmetric matrix only O(n³/3) operations are required to compute inv(L) and the reduced inverse. An example is presented to demonstrate the capability of using the reduced matrix inverse in treating ill-conditioned systems. Besides the simplicity of Broyden's update, the method provides a means to exploit the possible sparsity in the matrix and to derive a suitable preconditioner.

Keywords: Broyden's updates, matrix inverse, inverse factorization, solution of linear algebraic equations, ill-conditioned matrices, preconditioning

Procedia PDF Downloads 478

16 Scalable Learning of Tree-Based Models on Sparsely Representable Data

Authors: Fares Hedayatit, Arnauld Joly, Panagiotis Papadimitriou

Abstract:

Many machine learning tasks such as text annotation usually require training over very big datasets, e.g., millions of web documents, that can be represented in a sparse input space. State-of-the-art tree-based ensemble algorithms cannot scale to such datasets, since they include operations whose running time is a function of the input space size rather than a function of the non-zero input elements. In this paper, we propose an efficient splitting algorithm to leverage input sparsity within decision tree methods. Our algorithm improves training time over sparse datasets by more than two orders of magnitude and it has been incorporated in the current version of scikit-learn.org, the most popular open source Python machine learning library.

Keywords: big data, sparsely representable data, tree-based models, scalable learning

Procedia PDF Downloads 262

15 Bridging the Data Gap for Sexism Detection in Twitter: A Semi-Supervised Approach

Authors: Adeep Hande, Shubham Agarwal

Abstract:

This paper presents a study on identifying sexism in online texts using various state-of-the-art deep learning models based on BERT. We experimented with different feature sets and model architectures and evaluated their performance using precision, recall, F1 score, and accuracy metrics. We also explored the use of a pseudolabeling technique to improve model performance. Our experiments show that the best-performing models were based on BERT, and the multilingual model achieved an F1 score of 0.83. Furthermore, the use of pseudolabeling significantly improved the performance of the BERT-based models, with the best results achieved using the pseudolabeling technique. Our findings suggest that BERT-based models with pseudolabeling hold great promise for identifying sexism in online texts with high accuracy.

Keywords: large language models, semi-supervised learning, sexism detection, data sparsity

Procedia PDF Downloads 69

14 A Quantitative Evaluation of Text Feature Selection Methods

Authors: B. S. Harish, M. B. Revanasiddappa

Abstract:

Due to the rapid growth of text documents in digital form, automated text classification has become an important research area over the last two decades. The major challenges of text document representations are high dimensionality, sparsity, volume and semantics. Since terms are the only features that can be found in documents, the selection of good terms (features) plays a very important role. In text classification, feature selection is a strategy that can be used to improve classification effectiveness, computational efficiency and accuracy. In this paper, we present a quantitative analysis of the most widely used feature selection (FS) methods, viz. Term Frequency-Inverse Document Frequency (tfidf), Mutual Information (MI), Information Gain (IG), Chi-Square (χ²), Term Frequency-Relevance Frequency (tfrf), Term Strength (TS), Ambiguity Measure (AM) and Symbolic Feature Selection (SFS), to classify text documents. We evaluated all the feature selection methods on standard datasets such as 20 Newsgroups, the 4 University dataset and Reuters-21578.
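As an illustration of the first method listed in the abstract above, the following is a minimal sketch of tf-idf weighting used as a feature-selection score. It assumes raw term counts and idf = log(N / df), which is only one of several common variants; the helper names `tfidf` and `top_terms` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenised document.

    Uses raw term frequency and idf = log(N / df); other variants exist.
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


def top_terms(docs, k):
    """Keep the k terms with the highest total tf-idf across all documents."""
    totals = Counter()
    for w in tfidf(docs):
        totals.update(w)  # adds the weights term by term
    return [t for t, _ in totals.most_common(k)]
```

A term that occurs in every document receives a weight of zero under this variant, so it is never selected; that is the behaviour that makes tf-idf usable as a simple filter on high-dimensional, sparse term spaces.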

Keywords: classifiers, feature selection, text classification

Procedia PDF Downloads 458

13 Efficient Ground Targets Detection Using Compressive Sensing in Ground-Based Synthetic-Aperture Radar (SAR) Images

Authors: Gherbi Nabil

Abstract:

Detection of ground targets in SAR images is an important area of radar information processing. In the literature, various algorithms have been discussed in this context. However, most of them have low robustness and accuracy. To this end, we discuss target detection in SAR images based on compressive sensing. Firstly, traditional SAR image target detection algorithms are discussed, and their limitations are highlighted. Secondly, a compressive sensing method is proposed based on the sparsity of SAR images. Next, the detection problem is solved using a Multiple Measurement Vectors configuration. Furthermore, a robust Alternating Direction Method of Multipliers (ADMM) is developed to solve the optimization problem. Finally, the detection results obtained using raw complex data are presented. Experimental results on real SAR images have verified the effectiveness of the proposed algorithm.

Keywords: compressive sensing, raw complex data, synthetic aperture radar, ADMM

Procedia PDF Downloads 18

12 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral. Sentiment analysis of documents holds great importance in today's world, when vast amounts of information are stored in databases and on the World Wide Web. An efficient algorithm to elicit such information would be beneficial for social, economic as well as medical purposes. In this project, we have developed an algorithm to classify a document as positive or negative. Using our algorithm, we obtained a feature set from the data and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is made by many procedures such as Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use the empirical distribution for estimation, but developed a method based on the degree of close clustering of the data points. We have applied our algorithm to a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 401

11 HR MRI CS Based Image Reconstruction

Authors: Krzysztof Malczewski

Abstract:

A Magnetic Resonance Imaging (MRI) reconstruction algorithm using compressed sensing is presented in this paper. It is shown that the proposed approach improves the spatial resolution of MR images when highly undersampled k-space trajectories are applied. Compressed Sensing (CS) aims at reconstructing signals and images from significantly fewer measurements than were conventionally assumed necessary. MRI is a fundamental medical imaging method that struggles with an inherently slow data acquisition process. Applying CS to MRI has the potential for significant scan time reductions, with visible benefits for patients and health care economics. In this study the objective is to combine a super-resolution image enhancement algorithm with the benefits of the CS framework to achieve a high-resolution MR output image. Both methods emphasize maximizing image sparsity in a known sparse transform domain and minimizing the data-fidelity error. The presented algorithm accounts for cardiac and respiratory movements.

Keywords: super-resolution, MRI, compressed sensing, sparse-sense, image enhancement

Procedia PDF Downloads 429

10 Spherical Harmonic Based Monostatic Anisotropic Point Scatterer Model for RADAR Applications

Authors: Eric Huang, Coleman DeLude, Justin Romberg, Saibal Mukhopadhyay, Madhavan Swaminathan

Abstract:

High performance computing (HPC) based emulators can be used to model the scattering from multiple stationary and moving targets for RADAR applications. These emulators rely on the RADAR Cross Section (RCS) of the targets being available in complex scenarios. Representing the RCS using tables generated from electromagnetic (EM) simulations is often cumbersome and leads to large storage requirements. This paper proposes a spherical harmonic based anisotropic scatterer model to represent the RCS of complex targets. The problem of finding the locations and reflection profiles of all scatterers can be formulated as a linear least squares problem with a special sparsity constraint. This paper solves this problem using a modified Orthogonal Matching Pursuit algorithm. The results show that the spherical harmonic based scatterer model can effectively represent the RCS data of complex targets.

Keywords: RADAR, RCS, high performance computing, point scatterer model

Procedia PDF Downloads 190

9 A Transform Domain Function Controlled VSSLMS Algorithm for Sparse System Identification

Authors: Cemil Turan, Mohammad Shukri Salman

Abstract:

The convergence rate of the least-mean-square (LMS) algorithm deteriorates if the input signal to the filter is correlated. In a system identification problem, this convergence rate can be improved if the signal is white and/or if the system is sparse. We recently proposed a sparse transform domain LMS-type algorithm that uses a variable step-size for sparse system identification. The proposed algorithm provided high performance even when the input signal is highly correlated. In this work, we investigate the performance of the proposed TD-LMS algorithm for a large number of filter taps, which is also a critical issue for the standard LMS algorithm. Additionally, the optimum value of the most important parameter is calculated for all experiments. Moreover, the convergence analysis of the proposed algorithm is provided. The performance of the proposed algorithm has been compared to that of different algorithms in a sparse system identification setting with different sparsity levels and different numbers of filter taps. Simulations have shown that the proposed algorithm has prominent performance compared to the other algorithms.
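For orientation, the standard LMS update that the abstract above uses as its baseline can be sketched as follows. This is the plain time-domain LMS with a fixed step size, not the proposed transform-domain variable-step-size algorithm; the sparse 8-tap system `h`, the step size `mu = 0.01`, and the function names are hypothetical choices for illustration.

```python
import random


def lms_identify(x, d, n_taps, mu):
    """Estimate FIR taps w so that sum_k w[k] * x[n - k] tracks d[n]."""
    w = [0.0] * n_taps
    for n in range(len(x)):
        # Most recent n_taps input samples, newest first, zero-padded at the start
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))
        e = d[n] - y  # a-priori estimation error
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]
    return w


def simulate(seed=0, n_samples=5000):
    """Identify a sparse 8-tap system from noise-free white Gaussian input."""
    rng = random.Random(seed)
    h = [0.0, 0.0, 1.0, 0.0, 0.0, -0.5, 0.0, 0.0]  # hypothetical sparse system
    x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
         for n in range(n_samples)]
    return h, lms_identify(x, d, len(h), mu=0.01)
```

The white input used here is the favourable case; with a correlated input the same update converges more slowly, which is the deterioration the abstract refers to.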

Keywords: adaptive filtering, sparse system identification, TD-LMS algorithm, VSSLMS algorithm

Procedia PDF Downloads 359

8 Novel Recommender Systems Using Hybrid CF and Social Network Information

Authors: Kyoung-Jae Kim

Abstract:

Collaborative Filtering (CF) is a popular technique for personalization in the e-commerce domain to reduce information overload. In general, CF provides a list of recommended items based on the preferences of other similar users in the user-item matrix and uses them to predict the focal user's preference for particular items. Many real-world recommender systems use CF techniques because of their excellent accuracy and robustness. However, CF has some limitations, including sparsity problems and complex dimensionality in the user-item matrix. In addition, traditional CF does not consider the emotional interaction between users. In this study, we propose recommender systems using social network information and singular value decomposition (SVD) to alleviate some of these limitations. The purpose of this study is to reduce the dimensionality of the data set using SVD and to improve the performance of CF by using emotional information from the social network data of the focal user. We test the usability of the hybrid CF, SVD and social network information model using real-world data. The experimental results show that the proposed model outperforms conventional CF models.
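The conventional user-based CF that the abstract above describes as its starting point can be sketched as below. This covers only the baseline prediction step, not the proposed SVD and social-network hybrid; the choice of cosine similarity over co-rated items, the use of 0 to mark an unrated item, and the function names are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity over items both users rated (0 marks 'unrated')."""
    common = [i for i in range(len(u)) if u[i] and v[i]]
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, row in enumerate(ratings):
        if other == user or not row[item]:
            continue  # skip the focal user and users who did not rate the item
        s = cosine(ratings[user], row)
        num += s * row[item]
        den += abs(s)
    return num / den if den else None
```

The `None` return value is where the sparsity problem shows up in practice: when no similar user has rated the item, the baseline has nothing to average, which is one motivation for the dimensionality reduction and side information discussed in the abstract.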

Keywords: recommender systems, collaborative filtering, social network information, singular value decomposition

Procedia PDF Downloads 289

7 System Identification in Presence of Outliers

Authors: Chao Yu, Qing-Guo Wang, Dan Zhang

Abstract:

The outlier detection problem for dynamic systems is formulated as a matrix decomposition problem with low-rank, sparse matrices and further recast as a semidefinite programming (SDP) problem. A fast algorithm is presented to solve the resulting problem while keeping the solution matrix structure and it can greatly reduce the computational cost over the standard interior-point method. The computational burden is further reduced by proper construction of subsets of the raw data without violating low rank property of the involved matrix. The proposed method can make exact detection of outliers in case of no or little noise in output observations. In case of significant noise, a novel approach based on under-sampling with averaging is developed to denoise while retaining the saliency of outliers and so-filtered data enables successful outlier detection with the proposed method while the existing filtering methods fail. Use of recovered “clean” data from the proposed method can give much better parameter estimation compared with that based on the raw data.

Keywords: outlier detection, system identification, matrix decomposition, low-rank matrix, sparsity, semidefinite programming, interior-point methods, denoising

Procedia PDF Downloads 306

6 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level-based units. For subword-level models, using back translation proves to be slightly beneficial in low-resource (WO) to high-resource (FR) language translation for the transformer (but not for the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied method data with back translation leads to a substantial improvement of the translation quality.

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

Procedia PDF Downloads 144

5 PET Image Resolution Enhancement

Authors: Krzysztof Malczewski

Abstract:

PET is a widely applied scanning procedure in medical imaging research. It delivers measurements of functioning in distinct areas of the human brain while the patient is comfortable, conscious and alert. This article presents a new compressed sensing based super-resolution algorithm for improving the image resolution of clinical Positron Emission Tomography (PET) scanners. The issue of motion artifacts is well known as a side effect in PET studies. PET images are acquired over a limited period of time. As patients cannot hold their breath during PET data gathering, spatial blurring and motion artifacts are the usual result. These may lead to a wrong diagnosis. It is shown that the presented approach improves PET spatial resolution in cases where Compressed Sensing (CS) sequences are used. CS aims at reconstructing signals and images from significantly fewer measurements than were traditionally thought necessary. The application of CS to PET has the potential for significant scan time reductions, with visible benefits for patients and health care economics. In this study the goal is to combine a super-resolution image enhancement algorithm with the CS framework to achieve high-resolution PET output. Both methods emphasize maximizing image sparsity in a known sparse transform domain and minimizing the data-fidelity error.

Keywords: PET, super-resolution, image reconstruction, pattern recognition

Procedia PDF Downloads 369

4 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis were collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedures and laboratory tests is analyzed. A temporal pattern mining technique called Frequent Subgraph Mining (FSM) is used. The Model for End-stage Liver Disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement in the area under the curve (AUC). The FSM technique by itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk of higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods to predicting the one-year mortality of cirrhotic patients and to build a model that predicts one-year mortality significantly more accurately than the MELD score. We have also tested the potential of FSM and provided a new perspective on the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia PDF Downloads 133

3 Atomic Decomposition Audio Data Compression and Denoising Using Sparse Dictionary Feature Learning

Authors: T. Bryan, V. Kepuska, I. Kostnaic

Abstract:

A method of data compression and denoising is introduced that is based on atomic decomposition of audio data using “basis vectors” that are learned from the audio data itself. The basis vectors are shown to have higher data compression and better signal-to-noise enhancement than the Gabor and gammatone “seed atoms” that were used to generate them. The basis vectors are the input weights of a Sparse AutoEncoder (SAE) that is trained using “envelope samples” of windowed segments of the audio data. The envelope samples are extracted from the audio data by performing atomic decomposition with Gabor or gammatone seed atoms. This process, carried out by matching pursuit, identifies segments of audio data that are locally coherent with the seed atoms. The envelope samples are formed by taking the Kronecker products of the atomic envelopes with the locally coherent data segments. Oracle signal-to-noise ratio (SNR) versus data compression curves are generated for the seed atoms as well as the basis vectors learned from Gabor and gammatone seed atoms. SNR data compression curves are generated for speech signals as well as early American music recordings. The basis vectors are shown to have higher denoising capability for data compression rates ranging from 90% to 99.84% for speech as well as music. Envelope samples are displayed as images by folding the time series into column vectors. This display method is used to compare the output of the SAE with the envelope samples that produced it. The basis vectors are also displayed as images. Sparsity is shown to play an important role in producing the highest denoising basis vectors.

Keywords: sparse dictionary learning, autoencoder, sparse autoencoder, basis vectors, atomic decomposition, envelope sampling, envelope samples, Gabor, gammatone, matching pursuit

Procedia PDF Downloads 250

2 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. 
Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 49

1 Event Data Representation Based on Time Stamp for Pedestrian Detection

Authors: Yuta Nakano, Kozo Kajiwara, Atsushi Hori, Takeshi Fujita

Abstract:

In association with the wave of electric vehicles (EV), low energy consumption systems have become more and more important. One of the key technologies to realize low energy consumption is the dynamic vision sensor (DVS), also called an event sensor or neuromorphic vision sensor. This sensor has several features, such as high temporal resolution, which can reach 1 Mframe/s, and a high dynamic range (120 dB). However, the point that contributes most to low energy consumption is its sparsity; to be more specific, this sensor only captures the pixels that have an intensity change. In other words, there is no signal in areas that do not have any intensity change. That is to say, this sensor is more energy efficient than conventional sensors such as RGB cameras because we can remove redundant data. On the other hand, it is difficult to handle the data because the data format is completely different from that of an RGB image; for example, acquired signals are asynchronous and sparse, and each signal is composed of an x-y coordinate, a polarity (two values: +1 or -1) and a time stamp; it does not include intensity such as RGB values. Therefore, as we cannot use existing algorithms straightforwardly, we have to design a new processing algorithm to cope with DVS data. In order to solve the difficulties caused by data format differences, most prior work builds frame data and feeds it to deep learning models such as Convolutional Neural Networks (CNN) for object detection and recognition purposes. However, even though we can feed the data this way, it is still difficult to achieve good performance due to a lack of intensity information. Although polarity is often used as intensity instead of RGB pixel values, it is apparent that polarity information is not rich enough. Considering this context, we proposed to use the timestamp information as a data representation that is fed to deep learning. 
Concretely, we first make frame data divided by a certain time period, then assign an intensity value according to the timestamp in each frame; for example, a high value is given to a recent signal. We expected that this data representation could capture the features, especially of moving objects, because the timestamp represents the movement direction and speed. Using this proposed method, we built our own dataset with a DVS fixed on a parked car to develop an application for a surveillance system that can detect persons around the car. We think DVS is one of the ideal sensors for surveillance purposes because this sensor can run for a long time with low energy consumption in a non-dynamic situation. For comparison purposes, we reproduced a state-of-the-art method as a benchmark, which makes frames in the same way as ours and feeds polarity information to a CNN. Then, we measured the object detection performance of the benchmark and of our method on the same dataset. As a result, our method achieved an F1 score up to 7 points higher than the benchmark.
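The timestamp-based frame described in the abstract above can be sketched as follows. The abstract only states that recent signals receive high values, so the linear mapping to the range 1..255, the half-open time window, and the rule of keeping the latest event per pixel are assumptions; `timestamp_frame` is a hypothetical name and this is not the authors' implementation.

```python
def timestamp_frame(events, width, height, t_start, t_end):
    """Build one frame whose pixel values encode how recent the last event was.

    `events` is an iterable of (x, y, polarity, t) tuples. Pixels with no
    event in [t_start, t_end) stay 0; the most recent events approach 255.
    """
    frame = [[0] * width for _ in range(height)]
    span = t_end - t_start
    for x, y, _polarity, t in events:
        if not (t_start <= t < t_end):
            continue  # event belongs to another time window
        value = 1 + int(254 * (t - t_start) / span)
        # Keep the latest (largest) value if a pixel fires more than once
        frame[y][x] = max(frame[y][x], value)
    return frame
```

Polarity is deliberately ignored in this sketch, which mirrors the contrast drawn in the abstract between a polarity-valued frame (the benchmark) and a timestamp-valued frame; a moving edge leaves a trail of increasing values along its direction of motion.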

Keywords: event camera, dynamic vision sensor, deep learning, data representation, object recognition, low energy consumption

Procedia PDF Downloads 97