Search results for: DNA sequences classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2742

Search results for: DNA sequences classification

2412 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah

Abstract:

Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, the semantic interpretation is a challenging task of HSI analysis. We focused in this paper on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: The spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. Therefore, the high number of spectral bands and the low number of training samples pose the problem of the curse of dimensionality. In order to resolve this problem, we propose to introduce the process of dimensionality reduction trying to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on spatial hypergraph embedding model to represent higher order relationships with different weights of the spatial neighbors corresponding to the centroid of pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.

Keywords: dimensionality reduction, hyperspectral image, semantic interpretation, spatial hypergraph

Procedia PDF Downloads 306
2411 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 162
2410 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

imbalanced data set, a problem often found in real world application, can cause seriously negative effect on classification performance of machine learning algorithms. There have been many attempts at dealing with classification of imbalanced data sets. In medical diagnosis classification, we often face the imbalanced number of data samples between the classes in which there are not enough samples in rare classes. In this paper, we proposed a learning method based on a cost sensitive extension of Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weight and some rules of thumb to determine those weights. After the balancing phase, we applythe different classifiers (support vector machine (SVM), k- nearest neighbor (KNN) and multilayer neuronal networks (MNN)) for balanced data set. We have also compared the obtained results before and after balancing method.

Keywords: multilayer neural networks, k- nearest neighbor, support vector machine, imbalanced medical data, least mean square algorithm, diabetes

Procedia PDF Downloads 532
2409 Enhanced Arabic Semantic Information Retrieval System Based on Arabic Text Classification

Authors: A. Elsehemy, M. Abdeen , T. Nazmy

Abstract:

Since the appearance of the Semantic web, many semantic search techniques and models were proposed to exploit the information in ontology to enhance the traditional keyword-based search. Many advances were made in languages such as English, German, French and Spanish. However, other languages such as Arabic are not fully supported yet. In this paper we present a framework for ontology based information retrieval for Arabic language. Our system consists of four main modules, namely query parser, indexer, search and a ranking module. Our approach includes building a semantic index by linking ontology concepts to documents, including an annotation weight for each link, to be used in ranking the results. We also augmented the framework with an automatic document categorizer, which enhances the overall document ranking. We have built three Arabic domain ontologies: Sports, Economic and Politics as example for the Arabic language. We built a knowledge base that consists of 79 classes and more than 1456 instances. The system is evaluated using the precision and recall metrics. We have done many retrieval operations on a sample of 40,316 documents with a size 320 MB of pure text. The results show that the semantic search enhanced with text classification gives better performance results than the system without classification.

Keywords: Arabic text classification, ontology based retrieval, Arabic semantic web, information retrieval, Arabic ontology

Procedia PDF Downloads 525
2408 Estimating Tree Height and Forest Classification from Multi Temporal Risat-1 HH and HV Polarized Satellite Aperture Radar Interferometric Phase Data

Authors: Saurav Kumar Suman, P. Karthigayani

Abstract:

In this paper the height of the tree is estimated and forest types is classified from the multi temporal RISAT-1 Horizontal-Horizontal (HH) and Horizontal-Vertical (HV) Polarised Satellite Aperture Radar (SAR) data. The novelty of the proposed project is combined use of the Back-scattering Coefficients (Sigma Naught) and the Coherence. It uses Water Cloud Model (WCM). The approaches use two main steps. (a) Extraction of the different forest parameter data from the Product.xml, BAND-META file and from Grid-xxx.txt file come with the HH & HV polarized data from the ISRO (Indian Space Research Centre). These file contains the required parameter during height estimation. (b) Calculation of the Vegetation and Ground Backscattering, Coherence and other Forest Parameters. (c) Classification of Forest Types using the ENVI 5.0 Tool and ROI (Region of Interest) calculation.

Keywords: RISAT-1, classification, forest, SAR data

Procedia PDF Downloads 407
2407 Ensemble-Based SVM Classification Approach for miRNA Prediction

Authors: Sondos M. Hammad, Sherin M. ElGokhy, Mahmoud M. Fahmy, Elsayed A. Sallam

Abstract:

In this paper, an ensemble-based Support Vector Machine (SVM) classification approach is proposed. It is used for miRNA prediction. Three problems, commonly associated with previous approaches, are alleviated. These problems arise due to impose assumptions on the secondary structural of premiRNA, imbalance between the numbers of the laboratory checked miRNAs and the pseudo-hairpins, and finally using a training data set that does not consider all the varieties of samples in different species. We aggregate the predicted outputs of three well-known SVM classifiers; namely, Triplet-SVM, Virgo and Mirident, weighted by their variant features without any structural assumptions. An additional SVM layer is used in aggregating the final output. The proposed approach is trained and then tested with balanced data sets. The results of the proposed approach outperform the three base classifiers. Improved values for the metrics of 88.88% f-score, 92.73% accuracy, 90.64% precision, 96.64% specificity, 87.2% sensitivity, and the area under the ROC curve is 0.91 are achieved.

Keywords: MiRNAs, SVM classification, ensemble algorithm, assumption problem, imbalance data

Procedia PDF Downloads 349
2406 Use of Gaussian-Euclidean Hybrid Function Based Artificial Immune System for Breast Cancer Diagnosis

Authors: Cuneyt Yucelbas, Seral Ozsen, Sule Yucelbas, Gulay Tezel

Abstract:

Due to the fact that there exist only a small number of complex systems in artificial immune system (AIS) that work out nonlinear problems, nonlinear AIS approaches, among the well-known solution techniques, need to be developed. Gaussian function is usually used as similarity estimation in classification problems and pattern recognition. In this study, diagnosis of breast cancer, the second type of the most widespread cancer in women, was performed with different distance calculation functions that euclidean, gaussian and gaussian-euclidean hybrid function in the clonal selection model of classical AIS on Wisconsin Breast Cancer Dataset (WBCD), which was taken from the University of California, Irvine Machine-Learning Repository. We used 3-fold cross validation method to train and test the dataset. According to the results, the maximum test classification accuracy was reported as 97.35% by using of gaussian-euclidean hybrid function for fold-3. Also, mean of test classification accuracies for all of functions were obtained as 94.78%, 94.45% and 95.31% with use of euclidean, gaussian and gaussian-euclidean, respectively. With these results, gaussian-euclidean hybrid function seems to be a potential distance calculation method, and it may be considered as an alternative distance calculation method for hard nonlinear classification problems.

Keywords: artificial immune system, breast cancer diagnosis, Euclidean function, Gaussian function

Procedia PDF Downloads 435
2405 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of sequence characters which allow describing a text path. Usually, in clinical research, regular expressions are manually created by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm named REX was designed, and then, implemented as a simplified method to create regexes to classify Spanish text automatically. In order to classify ambiguous cases, such as, when multiple labels are assigned to a testing example, REX includes an information gain method Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statically better regarding accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 320
2404 One-Class Classification Approach Using Fukunaga-Koontz Transform and Selective Multiple Kernel Learning

Authors: Abdullah Bal

Abstract:

This paper presents a one-class classification (OCC) technique based on Fukunaga-Koontz Transform (FKT) for binary classification problems. The FKT is originally a powerful tool to feature selection and ordering for two-class problems. To utilize the standard FKT for data domain description problem (i.e., one-class classification), in this paper, a set of non-class samples which exist outside of positive class (target class) describing boundary formed with limited training data has been constructed synthetically. The tunnel-like decision boundary around upper and lower border of target class samples has been designed using statistical properties of feature vectors belonging to the training data. To capture higher order of statistics of data and increase discrimination ability, the proposed method, termed one-class FKT (OC-FKT), has been extended to its nonlinear version via kernel machines and referred as OC-KFKT for short. Multiple kernel learning (MKL) is a favorable family of machine learning such that tries to find an optimal combination of a set of sub-kernels to achieve a better result. However, the discriminative ability of some of the base kernels may be low and the OC-KFKT designed by this type of kernels leads to unsatisfactory classification performance. To address this problem, the quality of sub-kernels should be evaluated, and the weak kernels must be discarded before the final decision making process. MKL/OC-FKT and selective MKL/OC-FKT frameworks have been designed stimulated by ensemble learning (EL) to weight and then select the sub-classifiers using the discriminability and diversities measured by eigenvalue ratios. The eigenvalue ratios have been assessed based on their regions on the FKT subspaces. The comparative experiments, performed on various low and high dimensional data, against state-of-the-art algorithms confirm the effectiveness of our techniques, especially in case of small sample size (SSS) conditions.

Keywords: ensemble methods, fukunaga-koontz transform, kernel-based methods, multiple kernel learning, one-class classification

Procedia PDF Downloads 21
2403 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes and stroke characteristics variations. Malayalam is a complex south indian language spoken by about 35 million people especially in Kerala and Lakshadweep islands. In this paper, we consider the significant feature extraction for the similar stroke styles of Malayalam. This extracted feature set are suitable for the recognition of other handwritten south indian languages like Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy in classification and recognition of online malayalam handwritten characters. SVM Classifiers are the best for real world applications. The contribution of various features towards the accuracy in recognition is analysed. Performance for different kernels of SVM are also studied. A graphical user interface has developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after the preprocessing of input data samples. Highest recognition accuracy of 97% is obtained experimentally at the best feature combination with polynomial kernel in SVM.

Keywords: SVM, matlab, malayalam, South Indian scripts, onlinehandwritten character recognition

Procedia PDF Downloads 574
2402 A t-SNE and UMAP Based Neural Network Image Classification Algorithm

Authors: Shelby Simpson, William Stanley, Namir Naba, Xiaodi Wang

Abstract:

Both t-SNE and UMAP are brand new state of art tools to predominantly preserve the local structure that is to group neighboring data points together, which indeed provides a very informative visualization of heterogeneity in our data. In this research, we develop a t-SNE and UMAP base neural network image classification algorithm to embed the original dataset to a corresponding low dimensional dataset as a preprocessing step, then use this embedded database as input to our specially designed neural network classifier for image classification. We use the fashion MNIST data set, which is a labeled data set of images of clothing objects in our experiments. t-SNE and UMAP are used for dimensionality reduction of the data set and thus produce low dimensional embeddings. Furthermore, we use the embeddings from t-SNE and UMAP to feed into two neural networks. The accuracy of the models from the two neural networks is then compared to a dense neural network that does not use embedding as an input to show which model can classify the images of clothing objects more accurately.

Keywords: t-SNE, UMAP, fashion MNIST, neural networks

Procedia PDF Downloads 198
2401 Clustering Color Space, Time Interest Points for Moving Objects

Authors: Insaf Bellamine, Hamid Tairi

Abstract:

Detecting moving objects in sequences is an essential step for video analysis. This paper mainly contributes to the Color Space-Time Interest Points (CSTIP) extraction and detection. We propose a new method for detection of moving objects. Two main steps compose the proposed method. First, we suggest to apply the algorithm of the detection of Color Space-Time Interest Points (CSTIP) on both components of the Color Structure-Texture Image Decomposition which is based on a Partial Differential Equation (PDE): a color geometric structure component and a color texture component. A descriptor is associated to each of these points. In a second stage, we address the problem of grouping the points (CSTIP) into clusters. Experiments and comparison to other motion detection methods on challenging sequences show the performance of the proposed method and its utility for video analysis. Experimental results are obtained from very different types of videos, namely sport videos and animation movies.

Keywords: Color Space-Time Interest Points (CSTIP), Color Structure-Texture Image Decomposition, Motion Detection, clustering

Procedia PDF Downloads 378
2400 Autonomous Vehicle Detection and Classification in High Resolution Satellite Imagery

Authors: Ali J. Ghandour, Houssam A. Krayem, Abedelkarim A. Jezzini

Abstract:

High-resolution satellite images and remote sensing can provide global information in a fast way compared to traditional methods of data collection. Under such high resolution, a road is not a thin line anymore. Objects such as cars and trees are easily identifiable. Automatic vehicles enumeration can be considered one of the most important applications in traffic management. In this paper, autonomous vehicle detection and classification approach in highway environment is proposed. This approach consists mainly of three stages: (i) first, a set of preprocessing operations are applied including soil, vegetation, water suppression. (ii) Then, road networks detection and delineation is implemented using built-up area index, followed by several morphological operations. This step plays an important role in increasing the overall detection accuracy since vehicles candidates are objects contained within the road networks only. (iii) Multi-level Otsu segmentation is implemented in the last stage, resulting in vehicle detection and classification, where detected vehicles are classified into cars and trucks. Accuracy assessment analysis is conducted over different study areas to show the great efficiency of the proposed method, especially in highway environment.

Keywords: remote sensing, object identification, vehicle and road extraction, vehicle and road features-based classification

Procedia PDF Downloads 231
2399 Dynamic Distribution Calibration for Improved Few-Shot Image Classification

Authors: Majid Habib Khan, Jinwei Zhao, Xinhong Hei, Liu Jiedong, Rana Shahzad Noor, Muhammad Imran

Abstract:

Deep learning is increasingly employed in image classification, yet the scarcity and high cost of labeled data for training remain a challenge. Limited samples often lead to overfitting due to biased sample distribution. This paper introduces a dynamic distribution calibration method for few-shot learning. Initially, base and new class samples undergo normalization to mitigate disparate feature magnitudes. A pre-trained model then extracts feature vectors from both classes. The method dynamically selects distribution characteristics from base classes (both adjacent and remote) in the embedding space, using a threshold value approach for new class samples. Given the propensity of similar classes to share feature distributions like mean and variance, this research assumes a Gaussian distribution for feature vectors. Subsequently, distributional features of new class samples are calibrated using a corrected hyperparameter, derived from the distribution features of both adjacent and distant base classes. This calibration augments the new class sample set. The technique demonstrates significant improvements, with up to 4% accuracy gains in few-shot classification challenges, as evidenced by tests on miniImagenet and CUB datasets.

Keywords: deep learning, computer vision, image classification, few-shot learning, threshold

Procedia PDF Downloads 66
2398 Facial Pose Classification Using Hilbert Space Filling Curve and Multidimensional Scaling

Authors: Mekamı Hayet, Bounoua Nacer, Benabderrahmane Sidahmed, Taleb Ahmed

Abstract:

Pose estimation is an important task in computer vision. Though the majority of the existing solutions provide good accuracy results, they are often overly complex and computationally expensive. In this perspective, we propose the use of dimensionality reduction techniques to address the problem of facial pose estimation. Firstly, a face image is converted into one-dimensional time series using Hilbert space filling curve, then the approach converts these time series data to a symbolic representation. Furthermore, a distance matrix is calculated between symbolic series of an input learning dataset of images, to generate classifiers of frontal vs. profile face pose. The proposed method is evaluated with three public datasets. Experimental results have shown that our approach is able to achieve a correct classification rate exceeding 97% with K-NN algorithm.

Keywords: machine learning, pattern recognition, facial pose classification, time series

Procedia PDF Downloads 350
2397 A Cross Cultural Study of Jewish and Arab Listeners: Perception of Harmonic Sequences

Authors: Roni Granot

Abstract:

Musical intervals are the building blocks of melody and harmony. Intervals differ in terms of their size, direction, or quality as consonants or dissonants. In Western music, perceptual dissonance is mostly associated with the sensation of beats or periodicity, whereas cognitive dissonance is associated with rules of harmony and voice leading. These two perceptions can be studied separately in musical cultures which include melodic with little or no harmonic structures. In the Arab musical system, there is a number of different quarter- tone intervals creating various combinations of consonant and dissonant intervals. While traditional Arab music includes only melody, today’s Arab pop music includes harmonization of songs, often using typical Western harmonic sequences. Therefore, the Arab population in Israel presents an interesting case which enables us to examine the distinction between perceptual and cognitive dissonance. In the current study, we compared the responses of 34 Jewish Western listeners and 56 Arab listeners to two types of stimuli and their relationships: Harmonic sequences and isolated harmonic intervals (dyads). Harmonic sequences were presented in synthesized piano tones and represented five levels of Harmonic prototypicality (Tonic ending; Tonic ending with half flattened third; Deceptive cadence; Half cadence; and Dissonant unrelated ending) and were rated on 5-point scales of closure and surprise. Here we report only findings related to the harmonic sequences. One-way repeated measures ANOVA with one within subjects factor with five levels (Type of sequence) and one between- subjects factor (Musical background) indicates a main effect of Type of sequence for surprise ratings F (4, 85) = 51 p<.001, and for closure ratings F (4, 78) 9.54 p < .001, no main effect of Background on either surprise or closure ratings, and a marginally significant Type X Background interaction for surprise F (4, 352) = 6.05 p = .069 and closure ratings F (4, 324) 3.89 p < .01). Planned comparisons show that the interaction of Type of sequence X Background center around surprise and closure ratings of the regular versus the half- flattened third tonic and the deceptive versus the half cadence. The half- flattened third tonic is rated as less surprising and as demanding less continuation than the regular tonic by the Arab listeners as compared to the Western listeners. In addition, the half cadence is rated as more surprising but demanding less continuation than the deceptive cadence in the Arab listeners as compared to the Western listeners. Together, our results suggest that despite the vast exposure of Arab listeners to Western harmony, sensitivity to harmonic rules seems to be partial with preference to oriental sonorities such as half flattened third. In addition, the percept of directionality which demands sensitivity to the level on which closure is obtained and which is strongly entrenched in Western harmony, may not be fully integrated into the Arab listeners’ mental harmonic scheme. Results will be discussed in terms of broad differences between Western and Eastern aesthetic ideals.

Keywords: harmony, cross cultural, Arab music, closure

Procedia PDF Downloads 275
2396 COVID-19 Detection from Computed Tomography Images Using UNet Segmentation, Region Extraction, and Classification Pipeline

Authors: Kenan Morani, Esra Kaya Ayana

Abstract:

This study aimed to develop a novel pipeline for COVID-19 detection using a large and rigorously annotated database of computed tomography (CT) images. The pipeline consists of UNet-based segmentation, lung extraction, and a classification part, with the addition of optional slice removal techniques following the segmentation part. In this work, a batch normalization was added to the original UNet model to produce lighter and better localization, which is then utilized to build a full pipeline for COVID-19 diagnosis. To evaluate the effectiveness of the proposed pipeline, various segmentation methods were compared in terms of their performance and complexity. The proposed segmentation method with batch normalization outperformed traditional methods and other alternatives, resulting in a higher dice score on a publicly available dataset. Moreover, at the slice level, the proposed pipeline demonstrated high validation accuracy, indicating the efficiency of predicting 2D slices. At the patient level, the full approach exhibited higher validation accuracy and macro F1 score compared to other alternatives, surpassing the baseline. The classification component of the proposed pipeline utilizes a convolutional neural network (CNN) to make final diagnosis decisions. The COV19-CT-DB dataset, which contains a large number of CT scans with various types of slices and rigorously annotated for COVID-19 detection, was utilized for classification. The proposed pipeline outperformed many other alternatives on the dataset.

Keywords: classification, computed tomography, lung extraction, macro F1 score, UNet segmentation

Procedia PDF Downloads 131
2395 Exploring Multi-Feature Based Action Recognition Using Multi-Dimensional Dynamic Time Warping

Authors: Guoliang Lu, Changhou Lu, Xueyong Li

Abstract:

In action recognition, previous studies have demonstrated the effectiveness of using multiple features to improve the recognition performance. We focus on two practical issues: i) most studies use a direct way of concatenating/accumulating multi features to evaluate the similarity between two actions. This way could be too strong since each kind of feature can include different dimensions, quantities, etc; ii) in many studies, the employed classification methods lack of a flexible and effective mechanism to add new feature(s) into classification. In this paper, we explore an unified scheme based on recently-proposed multi-dimensional dynamic time warping (MD-DTW). Experiments demonstrated the scheme's effectiveness of combining multi-feature and the flexibility of adding new feature(s) to increase the recognition performance. In addition, the explored scheme also provides us an open architecture for using new advanced classification methods in the future to enhance action recognition.

Keywords: action recognition, multi features, dynamic time warping, feature combination

Procedia PDF Downloads 437
2394 Intelligent Transport System: Classification of Traffic Signs Using Deep Neural Networks in Real Time

Authors: Anukriti Kumar, Tanmay Singh, Dinesh Kumar Vishwakarma

Abstract:

Traffic control has been one of the most common and irritating problems since the time automobiles have hit the roads. Problems like traffic congestion have led to a significant time burden around the world and one significant solution to these problems can be the proper implementation of the Intelligent Transport System (ITS). It involves the integration of various tools like smart sensors, artificial intelligence, position technologies and mobile data services to manage traffic flow, reduce congestion and enhance driver's ability to avoid accidents during adverse weather. Road and traffic signs’ recognition is an emerging field of research in ITS. Classification problem of traffic signs needs to be solved as it is a major step in our journey towards building semi-autonomous/autonomous driving systems. The purpose of this work focuses on implementing an approach to solve the problem of traffic sign classification by developing a Convolutional Neural Network (CNN) classifier using the GTSRB (German Traffic Sign Recognition Benchmark) dataset. Rather than using hand-crafted features, our model addresses the concern of exploding huge parameters and data method augmentations. Our model achieved an accuracy of around 97.6% which is comparable to various state-of-the-art architectures.

Keywords: multiclass classification, convolution neural network, OpenCV

Procedia PDF Downloads 176
2393 A Systematic Literature Review on Security and Privacy Design Patterns

Authors: Ebtehal Aljedaani, Maha Aljohani

Abstract:

Privacy and security patterns are both important for developing software that protects users' data and privacy. Privacy patterns are designed to address common privacy problems, such as unauthorized data collection and disclosure. Security patterns are designed to protect software from attack and ensure reliability and trustworthiness. Using privacy and security patterns, software engineers can implement security and privacy by design principles, which means that security and privacy are considered throughout the software development process. These patterns are available to translate "security & privacy-by-design" into practical advice for software engineering. Previous research on privacy and security patterns has typically focused on one category of patterns at a time. This paper aims to bridge this gap by merging the two categories and identifying their similarities and differences. To do this, the authors conducted a systematic literature review of 25 research papers on privacy and security patterns. The papers were analysed based on the category of the pattern, the classification of the pattern, and the security requirements that the pattern addresses. This paper presents the results of a comprehensive review of privacy and security design patterns. The review is intended to help future IT designers understand the relationship between the two types of patterns and how to use them to design secure and privacy-preserving software. The paper provides a clear classification of privacy and security design patterns, along with examples of each type. The authors found that there is only one widely accepted classification of privacy design patterns, while there are several competing classifications of security design patterns. Three types of security design patterns were found to be the most commonly used.

Keywords: design patterns, security, privacy, classification of patterns, security patterns, privacy patterns

Procedia PDF Downloads 132
2392 Diagnosis and Analysis of Automated Liver and Tumor Segmentation on CT

Authors: R. R. Ramsheeja, R. Sreeraj

Abstract:

For view the internal structures of the human body such as liver, brain, kidney etc have a wide range of different modalities for medical images are provided nowadays. Computer Tomography is one of the most significant medical image modalities. In this paper use CT liver images for study the use of automatic computer aided techniques to calculate the volume of the liver tumor. Segmentation method is used for the detection of tumor from the CT scan is proposed. Gaussian filter is used for denoising the liver image and Adaptive Thresholding algorithm is used for segmentation. Multiple Region Of Interest(ROI) based method that may help to characteristic the feature different. It provides a significant impact on classification performance. Due to the characteristic of liver tumor lesion, inherent difficulties appear selective. For a better performance, a novel proposed system is introduced. Multiple ROI based feature selection and classification are performed. In order to obtain of relevant features for Support Vector Machine(SVM) classifier is important for better generalization performance. The proposed system helps to improve the better classification performance, reason in which we can see a significant reduction of features is used. The diagnosis of liver cancer from the computer tomography images is very difficult in nature. Early detection of liver tumor is very helpful to save the human life.

Keywords: computed tomography (CT), multiple region of interest(ROI), feature values, segmentation, SVM classification

Procedia PDF Downloads 509
2391 Molecular Characterization of Grain Storage Proteins in Some Hordeum Species

Authors: Manar Makhoul, Buthainah Alsalamah, Salam Lawand, Hassan Azzam

Abstract:

The major storage proteins in endosperm of 33 cultivated and wild barley genotypes (H.vulgare, H. spontaneum, H. bulbosum, H. murinum, H. marinum) were analyzed to demonstrate the variation in the hordein polypeptides encoded by multigene families in grains. The SDS-PAGE revealed 13 and 17 alleles at the Hor1 and the Hor2 loci respectively, with frequencies from 0.83 to 14 and 0.56 to 13.41% respectively, while seven alleles at the Hor3 locus with frequencies from 3.63 to 30.91% were recognized. The phylogenetic analysis indicated to relevance of the polymorphism in hordein patterns as successful tool in identifying the individual genotypes and discriminating the species according to genome type. We also reported in this research complete nucleotide sequence B-hordein genes of seven wild and cultivated barley genotypes. A 152bp upstream sequence of B-hordein promoter contained a TATA box, CATC box, AAAG motif, N-motif and E-motif. In silico analysis of B-Hordein sequences demonstrated that the coding regions were not interrupted by any intron, and included the complete ORF which varied between 882 and 906 bp, and encoded mature proteins with 293-301 residues characterized by high contents of glutamine (29%), and proline (18%). Comparison of the predicted polypeptide sequences with the published ones suggested that all S-rich prolamins genes are descended from common ancestor. The sequence started at N-terminal with a signal peptide, and then followed directly by two domains; a repetitive one based on the repetition of the repeat unit PQQPFPQQ and C-terminal domain. Also, it was found that positions of the eight cysteine residues were highly conserved in all the B-hordein sequences, but Hordeum bulbosum had additional unpaired one. The phylogenetic tree of B-hordein polypeptide separated the genotypes in distinct seven subgroups. In general, the high homology between B-hordeins and LMW glutenin subunits suggests similar bread-making influences for these B-hordeins.

Keywords: hordeum, phylogenetic tree, sequencing, storage protein

Procedia PDF Downloads 267
2390 An Integrated Lightweight Naïve Bayes Based Webpage Classification Service for Smartphone Browsers

Authors: Mayank Gupta, Siba Prasad Samal, Vasu Kakkirala

Abstract:

The internet world and its priorities have changed considerably in the last decade. Browsing on smart phones has increased manifold and is set to explode much more. Users spent considerable time browsing different websites, that gives a great deal of insight into user’s preferences. Instead of plain information classifying different aspects of browsing like Bookmarks, History, and Download Manager into useful categories would improve and enhance the user’s experience. Most of the classification solutions are server side that involves maintaining server and other heavy resources. It has security constraints and maybe misses on contextual data during classification. On device, classification solves many such problems, but the challenge is to achieve accuracy on classification with resource constraints. This on device classification can be much more useful in personalization, reducing dependency on cloud connectivity and better privacy/security. This approach provides more relevant results as compared to current standalone solutions because it uses content rendered by browser which is customized by the content provider based on user’s profile. This paper proposes a Naive Bayes based lightweight classification engine targeted for a resource constraint devices. Our solution integrates with Web Browser that in turn triggers classification algorithm. Whenever a user browses a webpage, this solution extracts DOM Tree data from the browser’s rendering engine. This DOM data is a dynamic, contextual and secure data that can’t be replicated. This proposal extracts different features of the webpage that runs on an algorithm to classify into multiple categories. Naive Bayes based engine is chosen in this solution for its inherent advantages in using limited resources compared to other classification algorithms like Support Vector Machine, Neural Networks, etc. Naive Bayes classification requires small memory footprint and less computation suitable for smartphone environment. This solution has a feature to partition the model into multiple chunks that in turn will facilitate less usage of memory instead of loading a complete model. Classification of the webpages done through integrated engine is faster, more relevant and energy efficient than other standalone on device solution. This classification engine has been tested on Samsung Z3 Tizen hardware. The Engine is integrated into Tizen Browser that uses Chromium Rendering Engine. For this solution, extensive dataset is sourced from dmoztools.net and cleaned. This cleaned dataset has 227.5K webpages which are divided into 8 generic categories ('education', 'games', 'health', 'entertainment', 'news', 'shopping', 'sports', 'travel'). Our browser integrated solution has resulted in 15% less memory usage (due to partition method) and 24% less power consumption in comparison with standalone solution. This solution considered 70% of the dataset for training the data model and the rest 30% dataset for testing. An average accuracy of ~96.3% is achieved across the above mentioned 8 categories. This engine can be further extended for suggesting Dynamic tags and using the classification for differential uses cases to enhance browsing experience.

Keywords: chromium, lightweight engine, mobile computing, Naive Bayes, Tizen, web browser, webpage classification

Procedia PDF Downloads 163
2389 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification

Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike

Abstract:

Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased towards imbalanced datasets because of the skewness of the distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class and prevents the reduction of majority observations in imbalance datasets classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets- cancer dataset, german credit dataset, and banknote dataset. The specificity, sensitivity, and accuracy metrics were used to evaluate the performance of the proposed decision tree algorithm on the datasets. The evaluation results show that for some of the weights of our proposed decision tree, the specificity, sensitivity, and accuracy metrics gave better results compared to that of the ID3 decision tree and decision tree induced with minority entropy for all three datasets.

Keywords: data mining, decision tree, classification, imbalance dataset

Procedia PDF Downloads 137
2388 Land Cover Remote Sensing Classification Advanced Neural Networks Supervised Learning

Authors: Eiman Kattan

Abstract:

This study aims to evaluate the impact of classifying labelled remote sensing images conventional neural network (CNN) architecture, i.e., AlexNet on different land cover scenarios based on two remotely sensed datasets from different point of views such as the computational time and performance. Thus, a set of experiments were conducted to specify the effectiveness of the selected convolutional neural network using two implementing approaches, named fully trained and fine-tuned. For validation purposes, two remote sensing datasets, AID, and RSSCN7 which are publicly available and have different land covers features were used in the experiments. These datasets have a wide diversity of input data, number of classes, amount of labelled data, and texture patterns. A specifically designed interactive deep learning GPU training platform for image classification (Nvidia Digit) was employed in the experiments. It has shown efficiency in training, validation, and testing. As a result, the fully trained approach has achieved a trivial result for both of the two data sets, AID and RSSCN7 by 73.346% and 71.857% within 24 min, 1 sec and 8 min, 3 sec respectively. However, dramatic improvement of the classification performance using the fine-tuning approach has been recorded by 92.5% and 91% respectively within 24min, 44 secs and 8 min 41 sec respectively. The represented conclusion opens the opportunities for a better classification performance in various applications such as agriculture and crops remote sensing.

Keywords: conventional neural network, remote sensing, land cover, land use

Procedia PDF Downloads 370
2387 Faster, Lighter, More Accurate: A Deep Learning Ensemble for Content Moderation

Authors: Arian Hosseini, Mahmudul Hasan

Abstract:

To address the increasing need for efficient and accurate content moderation, we propose an efficient and lightweight deep classification ensemble structure. Our approach is based on a combination of simple visual features, designed for high-accuracy classification of violent content with low false positives. Our ensemble architecture utilizes a set of lightweight models with narrowed-down color features, and we apply it to both images and videos. We evaluated our approach using a large dataset of explosion and blast contents and compared its performance to popular deep learning models such as ResNet-50. Our evaluation results demonstrate significant improvements in prediction accuracy, while benefiting from 7.64x faster inference and lower computation cost. While our approach is tailored to explosion detection, it can be applied to other similar content moderation and violence detection use cases as well. Based on our experiments, we propose a "think small, think many" philosophy in classification scenarios. We argue that transforming a single, large, monolithic deep model into a verification-based step model ensemble of multiple small, simple, and lightweight models with narrowed-down visual features can possibly lead to predictions with higher accuracy.

Keywords: deep classification, content moderation, ensemble learning, explosion detection, video processing

Procedia PDF Downloads 55
2386 Improve Divers Tracking and Classification in Sonar Images Using Robust Diver Wake Detection Algorithm

Authors: Mohammad Tarek Al Muallim, Ozhan Duzenli, Ceyhun Ilguy

Abstract:

Harbor protection systems are so important. The need for automatic protection systems has increased over the last years. Diver detection active sonar has great significance. It used to detect underwater threats such as divers and autonomous underwater vehicle. To automatically detect such threats the sonar image is processed by algorithms. These algorithms used to detect, track and classify of underwater objects. In this work, divers tracking and classification algorithm is improved be proposing a robust wake detection method. To detect objects the sonar images is normalized then segmented based on fixed threshold. Next, the centroids of the segments are found and clustered based on distance metric. Then to track the objects linear Kalman filter is applied. To reduce effect of noise and creation of false tracks, the Kalman tracker is fine tuned. The tuning is done based on our active sonar specifications. After the tracks are initialed and updated they are subjected to a filtering stage to eliminate the noisy and unstable tracks. Also to eliminate object with a speed out of the diver speed range such as buoys and fast boats. Afterwards the result tracks are subjected to a classification stage to deiced the type of the object been tracked. Here the classification stage is to deice wither if the tracked object is an open circuit diver or a close circuit diver. At the classification stage, a small area around the object is extracted and a novel wake detection method is applied. The morphological features of the object with his wake is extracted. We used support vector machine to find the best classifier. The sonar training images and the test images are collected by ARMELSAN Defense Technologies Company using the portable diver detection sonar ARAS-2023. After applying the algorithm to the test sonar data, we get fine and stable tracks of the divers. The total classification accuracy achieved with the diver type is 97%.

Keywords: harbor protection, diver detection, active sonar, wake detection, diver classification

Procedia PDF Downloads 238
2385 Credit Risk Assessment Using Rule Based Classifiers: A Comparative Study

Authors: Salima Smiti, Ines Gasmi, Makram Soui

Abstract:

Credit risk is the most important issue for financial institutions. Its assessment becomes an important task used to predict defaulter customers and classify customers as good or bad payers. To this objective, numerous techniques have been applied for credit risk assessment. However, to our knowledge, several evaluation techniques are black-box models such as neural networks, SVM, etc. They generate applicants’ classes without any explanation. In this paper, we propose to assess credit risk using rules classification method. Our output is a set of rules which describe and explain the decision. To this end, we will compare seven classification algorithms (JRip, Decision Table, OneR, ZeroR, Fuzzy Rule, PART and Genetic programming (GP)) where the goal is to find the best rules satisfying many criteria: accuracy, sensitivity, and specificity. The obtained results confirm the efficiency of the GP algorithm for German and Australian datasets compared to other rule-based techniques to predict the credit risk.

Keywords: credit risk assessment, classification algorithms, data mining, rule extraction

Procedia PDF Downloads 181
2384 Phylogenetic Relationships of Common Reef Fish Species in Vietnam

Authors: Dang Thuy Binh, Truong Thi Oanh, Le Phan Khanh Hung, Luong thi Tuong Vy

Abstract:

One of the greatest environmental challenges facing Asia is the management and conservation of the marine biodiversity threaten by fisheries overexploitation, pollution, habitat destruction, and climate change. To date, a few molecular taxonomical studies has been conducted on marine fauna in Vietnam. The purpose of this study was to clarify the phylogeny of economic and ecological reef fish species in Vietnam Reef fish species covering Labridae, Scaridae, Nemipteridae, Serranidae, Acanthuridae, Lutjanidae, Lethrinidae, Mullidae, Balistidae, Pseudochromidae, Pinguipedidae, Fistulariidae, Holocentridae, Synodontidae, and Pomacentridae representing 28 genera were collected from South and Center, Vietnam. Combine with Genbank sequences, a phylogenetic tree was constructed based on 16S gene of mitochondrial DNA using maximum parsimony, maximum likelihood, and Bayesian inference approaches. The phylogram showed the well-resolved clades at genus and family level. Perciformes is the major order of reef fish species in Vietnam. The monophyly of Perciformes is not strongly supported as it was clustered in the same clade with Tetraodontiformes syngnathiformes and Beryciformes. Continue sampling of commercial fish species and classification based on morphology and genetics to build DNA barcoding of fish species in Vietnam is really necessary.

Keywords: reef fish, 16s rDNA, Vietnam, phylogeny

Procedia PDF Downloads 438
2383 Robust Pattern Recognition via Correntropy Generalized Orthogonal Matching Pursuit

Authors: Yulong Wang, Yuan Yan Tang, Cuiming Zou, Lina Yang

Abstract:

This paper presents a novel sparse representation method for robust pattern classification. Generalized orthogonal matching pursuit (GOMP) is a recently proposed efficient sparse representation technique. However, GOMP adopts the mean square error (MSE) criterion and assign the same weights to all measurements, including both severely and slightly corrupted ones. To reduce the limitation, we propose an information-theoretic GOMP (ITGOMP) method by exploiting the correntropy induced metric. The results show that ITGOMP can adaptively assign small weights on severely contaminated measurements and large weights on clean ones, respectively. An ITGOMP based classifier is further developed for robust pattern classification. The experiments on public real datasets demonstrate the efficacy of the proposed approach.

Keywords: correntropy induced metric, matching pursuit, pattern classification, sparse representation

Procedia PDF Downloads 355