Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 4870

Search results for: computer virus classification

4390 Use of Gaussian-Euclidean Hybrid Function Based Artificial Immune System for Breast Cancer Diagnosis

Authors: Cuneyt Yucelbas, Seral Ozsen, Sule Yucelbas, Gulay Tezel

Abstract:

Due to the fact that there exist only a small number of complex systems in artificial immune system (AIS) that work out nonlinear problems, nonlinear AIS approaches, among the well-known solution techniques, need to be developed. Gaussian function is usually used as similarity estimation in classification problems and pattern recognition. In this study, diagnosis of breast cancer, the second type of the most widespread cancer in women, was performed with different distance calculation functions that euclidean, gaussian and gaussian-euclidean hybrid function in the clonal selection model of classical AIS on Wisconsin Breast Cancer Dataset (WBCD), which was taken from the University of California, Irvine Machine-Learning Repository. We used 3-fold cross validation method to train and test the dataset. According to the results, the maximum test classification accuracy was reported as 97.35% by using of gaussian-euclidean hybrid function for fold-3. Also, mean of test classification accuracies for all of functions were obtained as 94.78%, 94.45% and 95.31% with use of euclidean, gaussian and gaussian-euclidean, respectively. With these results, gaussian-euclidean hybrid function seems to be a potential distance calculation method, and it may be considered as an alternative distance calculation method for hard nonlinear classification problems.

Keywords: artificial immune system, breast cancer diagnosis, Euclidean function, Gaussian function

Procedia PDF Downloads 421

4389 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of sequence characters which allow describing a text path. Usually, in clinical research, regular expressions are manually created by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm named REX was designed, and then, implemented as a simplified method to create regexes to classify Spanish text automatically. In order to classify ambiguous cases, such as, when multiple labels are assigned to a testing example, REX includes an information gain method Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statically better regarding accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 301

4388 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes and stroke characteristics variations. Malayalam is a complex south indian language spoken by about 35 million people especially in Kerala and Lakshadweep islands. In this paper, we consider the significant feature extraction for the similar stroke styles of Malayalam. This extracted feature set are suitable for the recognition of other handwritten south indian languages like Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy in classification and recognition of online malayalam handwritten characters. SVM Classifiers are the best for real world applications. The contribution of various features towards the accuracy in recognition is analysed. Performance for different kernels of SVM are also studied. A graphical user interface has developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after the preprocessing of input data samples. Highest recognition accuracy of 97% is obtained experimentally at the best feature combination with polynomial kernel in SVM.

Keywords: SVM, matlab, malayalam, South Indian scripts, onlinehandwritten character recognition

Procedia PDF Downloads 557

4387 A t-SNE and UMAP Based Neural Network Image Classification Algorithm

Authors: Shelby Simpson, William Stanley, Namir Naba, Xiaodi Wang

Abstract:

Both t-SNE and UMAP are brand new state of art tools to predominantly preserve the local structure that is to group neighboring data points together, which indeed provides a very informative visualization of heterogeneity in our data. In this research, we develop a t-SNE and UMAP base neural network image classification algorithm to embed the original dataset to a corresponding low dimensional dataset as a preprocessing step, then use this embedded database as input to our specially designed neural network classifier for image classification. We use the fashion MNIST data set, which is a labeled data set of images of clothing objects in our experiments. t-SNE and UMAP are used for dimensionality reduction of the data set and thus produce low dimensional embeddings. Furthermore, we use the embeddings from t-SNE and UMAP to feed into two neural networks. The accuracy of the models from the two neural networks is then compared to a dense neural network that does not use embedding as an input to show which model can classify the images of clothing objects more accurately.

Keywords: t-SNE, UMAP, fashion MNIST, neural networks

Procedia PDF Downloads 175

4386 New Isolate of Cucumber Mosaic Virus Infecting Banana

Authors: Abdelsabour G. A. Khaled, Ahmed W. A. Abdalla And Sabry Y. M. Mahmoud

Abstract:

Banana plants showing typical mosaic and yellow stripes on leaves as symptoms were collected from Assiut Governorate in Egypt. The causal agent was identified as Cucumber mosaic virus (CMV) on the basis of symptoms, transmission, serology, transmission electron microscopy and reverse transcription polymerase chain reaction (RT-PCR). Coat protein (CP) gene was amplified using gene specific primers for coat protein (CP), followed by cloning into desired cloning vector for sequencing. In this study the CMV was transmitted into propagation host either by aphid or mechanically. The transmission was confirmed through Direct Antigen Coating Enzyme Linked Immuno Sorbent Assay (DAC-ELISA). Analysis of the 120 deduced amino acid sequence of the coat protein gene revealed that the EG-A strain of CMV shared from 97.50 to 98.33% with those strains belonging to subgroup IA. The cluster analysis grouped the Egyptian isolate with strains Fny and Ri8 belonging sub-group IA. It appears that there occurs a high incidence of CMV infecting banana belonging to IA subgroup in most parts of Egypt.

Keywords: banana, CMV, transmission, CP gene, RT-PCR

Procedia PDF Downloads 325

4385 CRISPR Technology: A Tool in the Potential Cure for COVID-19 Virus

Authors: Chijindu Okpalaoka, Charles Chinedu Onuselogu

Abstract:

COVID-19, humanity's coronavirus disease caused by SARS-CoV-2, was first detected in late 2019 in Wuhan, China. COVID-19 lacked an established conventional pharmaceutical therapy, and as a result, the outbreak quickly became an epidemic affecting the entire World. Only a qPCR assay is reliable for diagnosing COVID-19. Clustered, regularly interspaced short palindromic repeats (CRISPR) technology is being researched for speedy and specific identification of COVID-19, among other therapeutic techniques. Apart from its therapeutic capabilities, the CRISPR technique is being evaluated to develop antiviral therapies; nevertheless, no CRISPR-based medication has been approved for human use to date. Prophylactic antiviral CRISPR in living being cells, a Cas 13-based approach against coronavirus, has been developed. While this method can be evolved into a treatment approach, it may face substantial obstacles in human clinical trials for licensure. This study discussed the potential applications of CRISPR-based techniques for developing a speedy and accurate feasible treatment alternative for the COVID-19 virus.

Keywords: COVID-19, CRISPR technique, Cas13, SARS-CoV-2, prophylactic antiviral

Procedia PDF Downloads 106

4384 Survey Based Data Security Evaluation in Pakistan Financial Institutions against Malicious Attacks

Authors: Naveed Ghani, Samreen Javed

Abstract:

In today’s heterogeneous network environment, there is a growing demand for distrust clients to jointly execute secure network to prevent from malicious attacks as the defining task of propagating malicious code is to locate new targets to attack. Residual risk is always there no matter what solutions are implemented or whet so ever security methodology or standards being adapted. Security is the first and crucial phase in the field of Computer Science. The main aim of the Computer Security is gathering of information with secure network. No one need wonder what all that malware is trying to do: It's trying to steal money through data theft, bank transfers, stolen passwords, or swiped identities. From there, with the help of our survey we learn about the importance of white listing, antimalware programs, security patches, log files, honey pots, and more used in banks for financial data protection but there’s also a need of implementing the IPV6 tunneling with Crypto data transformation according to the requirements of new technology to prevent the organization from new Malware attacks and crafting of its own messages and sending them to the target. In this paper the writer has given the idea of implementing IPV6 Tunneling Secessions on private data transmission from financial organizations whose secrecy needed to be safeguarded.

Keywords: network worms, malware infection propagating malicious code, virus, security, VPN

Procedia PDF Downloads 342

4383 Autonomous Vehicle Detection and Classification in High Resolution Satellite Imagery

Authors: Ali J. Ghandour, Houssam A. Krayem, Abedelkarim A. Jezzini

Abstract:

High-resolution satellite images and remote sensing can provide global information in a fast way compared to traditional methods of data collection. Under such high resolution, a road is not a thin line anymore. Objects such as cars and trees are easily identifiable. Automatic vehicles enumeration can be considered one of the most important applications in traffic management. In this paper, autonomous vehicle detection and classification approach in highway environment is proposed. This approach consists mainly of three stages: (i) first, a set of preprocessing operations are applied including soil, vegetation, water suppression. (ii) Then, road networks detection and delineation is implemented using built-up area index, followed by several morphological operations. This step plays an important role in increasing the overall detection accuracy since vehicles candidates are objects contained within the road networks only. (iii) Multi-level Otsu segmentation is implemented in the last stage, resulting in vehicle detection and classification, where detected vehicles are classified into cars and trucks. Accuracy assessment analysis is conducted over different study areas to show the great efficiency of the proposed method, especially in highway environment.

Keywords: remote sensing, object identification, vehicle and road extraction, vehicle and road features-based classification

Procedia PDF Downloads 214

4382 Trends in the Incidence of Bloodstream Infections in Patients with Hematological Malignancies in the Period 1991–2012

Authors: V. N. Chebotkevich, E. E. Schetinkina, V. V. Burylev, E. I. Kaytandzhan, N. P. Stizhak

Abstract:

Objective: Blood stream infections (BSI) are severe, life-threatening illness for immuno compromised patients with hematological malignancies. We report the trend in blood-stream infections in this group of patients in the period 1991-2013. Methods: A total of 4742 blood samples investigated. All blood cultures were incubated in a continuous monitoring system for 7 days before discarding negative. On signaled positive, organism was identified by conventional methods. The Real-time polymerase chain reaction (PCR) was used for the indication of human herpes virus 6 (HHV-6), Cytomegalovirus (CMV) and Epstein-Barr virus (EBV). Results: Between 1991 and 2001 the incidence of Gram-positive bacteria (Staphylococcus epidermidis, Staphylococcus aureus) being the most common germs isolated (70,9%) were as Gram-negative rods (Escherichia coli, Klebsiella spp., Pseudomonas spp.) – 29,1%. In next decade 2002-2012 the number of Gram-negative bacteria was increased up to 40.2%. It is shown that the incidence of bacteremia was significantly more frequent at the background of detectable Cytomegalovirus and Epstein-Barr virus-specific DNA in blood. Over recent years, an increased frequency of micro mycetes was registered in blood of the patients with hematological malignancies (Candida spp. was predominant). Conclusion: Accurate and timely detection of BSI is important in determining appropriate treatment of infectious complications in patients with hematological malignancies. The isolation of Staphylococcus epidermidis from blood cultures remains a clinical dilemma for physicians and microbiologists. But in many cases this agent is of the clinical significance in immunocompromised patients with hematological malignancies. The role of CMV and EBV in development of bacteremia was demonstrated.

Keywords: infectious complications, blood stream infections, bacteremia, hemoblastosis

Procedia PDF Downloads 331

4381 COVID-19 Detection from Computed Tomography Images Using UNet Segmentation, Region Extraction, and Classification Pipeline

Authors: Kenan Morani, Esra Kaya Ayana

Abstract:

This study aimed to develop a novel pipeline for COVID-19 detection using a large and rigorously annotated database of computed tomography (CT) images. The pipeline consists of UNet-based segmentation, lung extraction, and a classification part, with the addition of optional slice removal techniques following the segmentation part. In this work, a batch normalization was added to the original UNet model to produce lighter and better localization, which is then utilized to build a full pipeline for COVID-19 diagnosis. To evaluate the effectiveness of the proposed pipeline, various segmentation methods were compared in terms of their performance and complexity. The proposed segmentation method with batch normalization outperformed traditional methods and other alternatives, resulting in a higher dice score on a publicly available dataset. Moreover, at the slice level, the proposed pipeline demonstrated high validation accuracy, indicating the efficiency of predicting 2D slices. At the patient level, the full approach exhibited higher validation accuracy and macro F1 score compared to other alternatives, surpassing the baseline. The classification component of the proposed pipeline utilizes a convolutional neural network (CNN) to make final diagnosis decisions. The COV19-CT-DB dataset, which contains a large number of CT scans with various types of slices and rigorously annotated for COVID-19 detection, was utilized for classification. The proposed pipeline outperformed many other alternatives on the dataset.

Keywords: classification, computed tomography, lung extraction, macro F1 score, UNet segmentation

Procedia PDF Downloads 114

4380 Exploring Multi-Feature Based Action Recognition Using Multi-Dimensional Dynamic Time Warping

Authors: Guoliang Lu, Changhou Lu, Xueyong Li

Abstract:

In action recognition, previous studies have demonstrated the effectiveness of using multiple features to improve the recognition performance. We focus on two practical issues: i) most studies use a direct way of concatenating/accumulating multi features to evaluate the similarity between two actions. This way could be too strong since each kind of feature can include different dimensions, quantities, etc; ii) in many studies, the employed classification methods lack of a flexible and effective mechanism to add new feature(s) into classification. In this paper, we explore an unified scheme based on recently-proposed multi-dimensional dynamic time warping (MD-DTW). Experiments demonstrated the scheme's effectiveness of combining multi-feature and the flexibility of adding new feature(s) to increase the recognition performance. In addition, the explored scheme also provides us an open architecture for using new advanced classification methods in the future to enhance action recognition.

Keywords: action recognition, multi features, dynamic time warping, feature combination

Procedia PDF Downloads 424

4379 A Multi-Site Knowledge Attitude and Practice Survey of Ebola Virus Disease (EVD) in Nigeria

Authors: Ilyasu G., Ogoina D., Otu AA, Muhammed FD, Ebenso B., Otokpa D., Rotifa S., Tuduo-Wisdom O., Habib AG

Abstract:

Background: The 2014 Ebola Virus Disease (EVD) outbreak was characterized by fear, misconceptions and irrational behaviors. We conducted a knowledge attitude and practice survey of EVD in Nigeria to inform the institution of effective control measures. Methods: Between July 30th and September 30th 2014, a cross-sectional study on knowledge, attitude and practice (KAP) of Ebola Virus Disease (EVD) was undertaken among adults of the general population and healthcare workers (HCW) in three states of Nigeria, including Kano, Cross River and Bayelsa states. Demographic information and data on KAP were obtained using a self-administered standardized questionnaire. The percentage KAP scores were categorized as good and poor. Independent predictors of good knowledge of EVD were ascertained using a binary logistic regression model. Results: Out of 1035 study participants with a median age of 32 years, 648 (62.6%) were males, 846 (81.7%) had tertiary education and 441 (42.6%) were HCW. There were 218, 239 and 578 respondents from Bayelsa, Cross Rivers, and Kano states, respectively. The overall median percentage KAP scores and interquartile ranges (IQR) were 79.46% (15.07%), 95.0% (33.33%), and 49.95% (37.50%), respectively. Out of the 1035 respondents, 470 (45.4%), 544(52.56%), and 252 (24.35%) had good KAP of EVD defined using 80%, 90%, and 70% score cut-offs, respectively. Independent predictors of good knowledge of EVD were a HCW (Odds Ratio-OR-2.89, 95% Confidence interval-CI of 1.41-5.90), reporting ‘moderate to high fear of EVD’ (OR-2.15, 95% CI-1.47-3.13) and ‘willingness to modify habit’ (OR-1.68, 95% CI-1.23-2.30). Conclusion: Our results reveal suboptimal EVD-related knowledge, attitude and practice among adults in Nigeria. To effectively control future outbreaks of EVD in Nigeria, there is a need to institute public sensitization programs that improve understanding of EVD and address EVD-related myths and misconceptions, especially among the general population.

Keywords: Ebola, health care worker, knowledge, attitude

Procedia PDF Downloads 270

4378 Neuroblastoma in Children and the Potential Involvement of Viruses in Its Pathogenesis

Authors: Ugo Rovigatti

Abstract:

Neuroblastoma (NBL) has epitomized for at least 40 years our understanding of cancer cellular and molecular biology and its potential applications to novel therapeutic strategies. This includes the discovery of the very first oncogene aberrations and tumorigenesis suppression by differentiation in the 80s; the potential role of suppressor genes in the 90s; the relevance of immunotherapy in the millennium first, and the discovery of additional mutations by NGS technology in the millennium second decade. Similar discoveries were achieved in the majority of human cancers, and similar therapeutic interventions were obtained subsequently to NBL discoveries. Unfortunately, targeted therapies suggested by specific mutations (such as MYCN amplification –MNA- present in ¼ or 1/5 of cases) have not elicited therapeutic successes in aggressive NBL, where the prognosis is still dismal. The reasons appear to be linked to Tumor Heterogeneity, which is particularly evident in NBL but also a clear hallmark of aggressive human cancers generally. The new avenue of cancer immunotherapy (CIT) provided new hopes for cancer patients, but we still ignore the cellular or molecular targets. CIT is emblematic of high-risk disease (HR-NBL) since the mentioned GD2 passive immunotherapy is still providing better survival. We recently critically reviewed and evaluated the literature depicting the genomic landscapes of HR-NBL, coming to the qualified conclusion that among hundreds of affected genes, potential targets, or chromosomal sites, none correlated with anti-GD2 sensitivity. A better explanation is provided by the Micro-Foci inducing Virus (MFV) model, which predicts that neuroblasts infection with the MFV, an RNA virus isolated from a cancer-cluster (space-time association) of HR-NBL cases, elicits the appearance of MNA and additional genomic aberrations with mechanisms resembling chromothripsis. Neuroblasts infected with low titers of MFV amplified MYCN up to 100 folds and became highly transformed and malignant, thus causing neuroblastoma in young rat pups of strains SD and Fisher-344 and larger tumor masses in nu/nu mice. An association was discovered with GD2 since this glycosphingolipid is also the receptor for the family of MFV virus (dsRNA viruses). It is concluded that a dsRNA virus, MFV, appears to provide better explicatory mechanisms for the genesis of i) specific genomic aberrations such as MNA; ii) extensive tumor heterogeneity and chromothripsis; iii) the effects of passive immunotherapy with anti-GD2 monoclonals and that this and similar models should be further investigated in both pediatric and adult cancers.

Keywords: neuroblastoma, MYCN, amplification, viruses, GD2

Procedia PDF Downloads 87

4377 Intelligent Transport System: Classification of Traffic Signs Using Deep Neural Networks in Real Time

Authors: Anukriti Kumar, Tanmay Singh, Dinesh Kumar Vishwakarma

Abstract:

Traffic control has been one of the most common and irritating problems since the time automobiles have hit the roads. Problems like traffic congestion have led to a significant time burden around the world and one significant solution to these problems can be the proper implementation of the Intelligent Transport System (ITS). It involves the integration of various tools like smart sensors, artificial intelligence, position technologies and mobile data services to manage traffic flow, reduce congestion and enhance driver's ability to avoid accidents during adverse weather. Road and traffic signs’ recognition is an emerging field of research in ITS. Classification problem of traffic signs needs to be solved as it is a major step in our journey towards building semi-autonomous/autonomous driving systems. The purpose of this work focuses on implementing an approach to solve the problem of traffic sign classification by developing a Convolutional Neural Network (CNN) classifier using the GTSRB (German Traffic Sign Recognition Benchmark) dataset. Rather than using hand-crafted features, our model addresses the concern of exploding huge parameters and data method augmentations. Our model achieved an accuracy of around 97.6% which is comparable to various state-of-the-art architectures.

Keywords: multiclass classification, convolution neural network, OpenCV

Procedia PDF Downloads 157

4376 A Systematic Literature Review on Security and Privacy Design Patterns

Authors: Ebtehal Aljedaani, Maha Aljohani

Abstract:

Privacy and security patterns are both important for developing software that protects users' data and privacy. Privacy patterns are designed to address common privacy problems, such as unauthorized data collection and disclosure. Security patterns are designed to protect software from attack and ensure reliability and trustworthiness. Using privacy and security patterns, software engineers can implement security and privacy by design principles, which means that security and privacy are considered throughout the software development process. These patterns are available to translate "security & privacy-by-design" into practical advice for software engineering. Previous research on privacy and security patterns has typically focused on one category of patterns at a time. This paper aims to bridge this gap by merging the two categories and identifying their similarities and differences. To do this, the authors conducted a systematic literature review of 25 research papers on privacy and security patterns. The papers were analysed based on the category of the pattern, the classification of the pattern, and the security requirements that the pattern addresses. This paper presents the results of a comprehensive review of privacy and security design patterns. The review is intended to help future IT designers understand the relationship between the two types of patterns and how to use them to design secure and privacy-preserving software. The paper provides a clear classification of privacy and security design patterns, along with examples of each type. The authors found that there is only one widely accepted classification of privacy design patterns, while there are several competing classifications of security design patterns. Three types of security design patterns were found to be the most commonly used.

Keywords: design patterns, security, privacy, classification of patterns, security patterns, privacy patterns

Procedia PDF Downloads 106

4375 A Feature Clustering-Based Sequential Selection Approach for Color Texture Classification

Authors: Mohamed Alimoussa, Alice Porebski, Nicolas Vandenbroucke, Rachid Oulad Haj Thami, Sana El Fkihi

Abstract:

Color and texture are highly discriminant visual cues that provide an essential information in many types of images. Color texture representation and classification is therefore one of the most challenging problems in computer vision and image processing applications. Color textures can be represented in different color spaces by using multiple image descriptors which generate a high dimensional set of texture features. In order to reduce the dimensionality of the feature set, feature selection techniques can be used. The goal of feature selection is to find a relevant subset from an original feature space that can improve the accuracy and efficiency of a classification algorithm. Traditionally, feature selection is focused on removing irrelevant features, neglecting the possible redundancy between relevant ones. This is why some feature selection approaches prefer to use feature clustering analysis to aid and guide the search. These techniques can be divided into two categories. i) Feature clustering-based ranking algorithm uses feature clustering as an analysis that comes before feature ranking. Indeed, after dividing the feature set into groups, these approaches perform a feature ranking in order to select the most discriminant feature of each group. ii) Feature clustering-based subset search algorithms can use feature clustering following one of three strategies; as an initial step that comes before the search, binded and combined with the search or as the search alternative and replacement. In this paper, we propose a new feature clustering-based sequential selection approach for the purpose of color texture representation and classification. Our approach is a three step algorithm. First, irrelevant features are removed from the feature set thanks to a class-correlation measure. Then, introducing a new automatic feature clustering algorithm, the feature set is divided into several feature clusters. Finally, a sequential search algorithm, based on a filter model and a separability measure, builds a relevant and non redundant feature subset: at each step, a feature is selected and features of the same cluster are removed and thus not considered thereafter. This allows to significantly speed up the selection process since large number of redundant features are eliminated at each step. The proposed algorithm uses the clustering algorithm binded and combined with the search. Experiments using a combination of two well known texture descriptors, namely Haralick features extracted from Reduced Size Chromatic Co-occurence Matrices (RSCCMs) and features extracted from Local Binary patterns (LBP) image histograms, on five color texture data sets, Outex, NewBarktex, Parquet, Stex and USPtex demonstrate the efficiency of our method compared to seven of the state of the art methods in terms of accuracy and computation time.

Keywords: feature selection, color texture classification, feature clustering, color LBP, chromatic cooccurrence matrix

Procedia PDF Downloads 114

4374 Enhancing Fall Detection Accuracy with a Transfer Learning-Aided Transformer Model Using Computer Vision

Authors: Sheldon McCall, Miao Yu, Liyun Gong, Shigang Yue, Stefanos Kollias

Abstract:

Falls are a significant health concern for older adults globally, and prompt identification is critical to providing necessary healthcare support. Our study proposes a new fall detection method using computer vision based on modern deep learning techniques. Our approach involves training a trans- former model on a large 2D pose dataset for general action recognition, followed by transfer learning. Specifically, we freeze the first few layers of the trained transformer model and train only the last two layers for fall detection. Our experimental results demonstrate that our proposed method outperforms both classical machine learning and deep learning approaches in fall/non-fall classification. Overall, our study suggests that our proposed methodology could be a valuable tool for identifying falls.

Keywords: healthcare, fall detection, transformer, transfer learning

Procedia PDF Downloads 110

4373 An Integrated Lightweight Naïve Bayes Based Webpage Classification Service for Smartphone Browsers

Authors: Mayank Gupta, Siba Prasad Samal, Vasu Kakkirala

Abstract:

The internet world and its priorities have changed considerably in the last decade. Browsing on smart phones has increased manifold and is set to explode much more. Users spent considerable time browsing different websites, that gives a great deal of insight into user’s preferences. Instead of plain information classifying different aspects of browsing like Bookmarks, History, and Download Manager into useful categories would improve and enhance the user’s experience. Most of the classification solutions are server side that involves maintaining server and other heavy resources. It has security constraints and maybe misses on contextual data during classification. On device, classification solves many such problems, but the challenge is to achieve accuracy on classification with resource constraints. This on device classification can be much more useful in personalization, reducing dependency on cloud connectivity and better privacy/security. This approach provides more relevant results as compared to current standalone solutions because it uses content rendered by browser which is customized by the content provider based on user’s profile. This paper proposes a Naive Bayes based lightweight classification engine targeted for a resource constraint devices. Our solution integrates with Web Browser that in turn triggers classification algorithm. Whenever a user browses a webpage, this solution extracts DOM Tree data from the browser’s rendering engine. This DOM data is a dynamic, contextual and secure data that can’t be replicated. This proposal extracts different features of the webpage that runs on an algorithm to classify into multiple categories. Naive Bayes based engine is chosen in this solution for its inherent advantages in using limited resources compared to other classification algorithms like Support Vector Machine, Neural Networks, etc. Naive Bayes classification requires small memory footprint and less computation suitable for smartphone environment. This solution has a feature to partition the model into multiple chunks that in turn will facilitate less usage of memory instead of loading a complete model. Classification of the webpages done through integrated engine is faster, more relevant and energy efficient than other standalone on device solution. This classification engine has been tested on Samsung Z3 Tizen hardware. The Engine is integrated into Tizen Browser that uses Chromium Rendering Engine. For this solution, extensive dataset is sourced from dmoztools.net and cleaned. This cleaned dataset has 227.5K webpages which are divided into 8 generic categories ('education', 'games', 'health', 'entertainment', 'news', 'shopping', 'sports', 'travel'). Our browser integrated solution has resulted in 15% less memory usage (due to partition method) and 24% less power consumption in comparison with standalone solution. This solution considered 70% of the dataset for training the data model and the rest 30% dataset for testing. An average accuracy of ~96.3% is achieved across the above mentioned 8 categories. This engine can be further extended for suggesting Dynamic tags and using the classification for differential uses cases to enhance browsing experience.

Keywords: chromium, lightweight engine, mobile computing, Naive Bayes, Tizen, web browser, webpage classification

Procedia PDF Downloads 144

4372 CRISPR/Cas9 Based Gene Stacking in Plants for Virus Resistance Using Site-Specific Recombinases

Authors: Sabin Aslam, Sultan Habibullah Khan, James G. Thomson, Abhaya M. Dandekar

Abstract:

Losses due to viral diseases are posing a serious threat to crop production. A quick breakdown of resistance to viruses like Cotton Leaf Curl Virus (CLCuV) demands the application of a proficient technology to engineer durable resistance. Gene stacking has recently emerged as a potential approach for integrating multiple genes in crop plants. In the present study, recombinase technology has been used for site-specific gene stacking. A target vector (pG-Rec) was designed for engineering a predetermined specific site in the plant genome whereby genes can be stacked repeatedly. Using Agrobacterium-mediated transformation, the pG-Rec was transformed into Coker-312 along with Nicotiana tabacum L. cv. Xanthi and Nicotiana benthamiana. The transgene analysis of target lines was conducted through junction PCR. The transgene positive target lines were used for further transformations to site-specifically stack two genes of interest using Bxb1 and PhiC31 recombinases. In the first instance, Cas9 driven by multiplex gRNAs (for Rep gene of CLCuV) was site-specifically integrated into the target lines and determined by the junction PCR and real-time PCR. The resulting plants were subsequently used to stack the second gene of interest (AVP3 gene from Arabidopsis for enhancing cotton plant growth). The addition of the genes is simultaneously achieved with the removal of marker genes for recycling with the next round of gene stacking. Consequently, transgenic marker-free plants were produced with two genes stacked at the specific site. These transgenic plants can be potential germplasm to introduce resistance against various strains of cotton leaf curl virus (CLCuV) and abiotic stresses. The results of the research demonstrate gene stacking in crop plants, a technology that can be used to introduce multiple genes sequentially at predefined genomic sites. The current climate change scenario highlights the use of such technologies so that gigantic environmental issues can be tackled by several traits in a single step. After evaluating virus resistance in the resulting plants, the lines can be a primer to initiate stacking of further genes in Cotton for other traits as well as molecular breeding with elite cotton lines.

Keywords: cotton, CRISPR/Cas9, gene stacking, genome editing, recombinases

Procedia PDF Downloads 128

4371 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification

Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike

Abstract:

Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased towards imbalanced datasets because of the skewness of the distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class and prevents the reduction of majority observations in imbalance datasets classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets- cancer dataset, german credit dataset, and banknote dataset. The specificity, sensitivity, and accuracy metrics were used to evaluate the performance of the proposed decision tree algorithm on the datasets. The evaluation results show that for some of the weights of our proposed decision tree, the specificity, sensitivity, and accuracy metrics gave better results compared to that of the ID3 decision tree and decision tree induced with minority entropy for all three datasets.

Keywords: data mining, decision tree, classification, imbalance dataset

Procedia PDF Downloads 108

4370 Land Cover Remote Sensing Classification Advanced Neural Networks Supervised Learning

Authors: Eiman Kattan

Abstract:

This study aims to evaluate the impact of classifying labelled remote sensing images conventional neural network (CNN) architecture, i.e., AlexNet on different land cover scenarios based on two remotely sensed datasets from different point of views such as the computational time and performance. Thus, a set of experiments were conducted to specify the effectiveness of the selected convolutional neural network using two implementing approaches, named fully trained and fine-tuned. For validation purposes, two remote sensing datasets, AID, and RSSCN7 which are publicly available and have different land covers features were used in the experiments. These datasets have a wide diversity of input data, number of classes, amount of labelled data, and texture patterns. A specifically designed interactive deep learning GPU training platform for image classification (Nvidia Digit) was employed in the experiments. It has shown efficiency in training, validation, and testing. As a result, the fully trained approach has achieved a trivial result for both of the two data sets, AID and RSSCN7 by 73.346% and 71.857% within 24 min, 1 sec and 8 min, 3 sec respectively. However, dramatic improvement of the classification performance using the fine-tuning approach has been recorded by 92.5% and 91% respectively within 24min, 44 secs and 8 min 41 sec respectively. The represented conclusion opens the opportunities for a better classification performance in various applications such as agriculture and crops remote sensing.

Keywords: conventional neural network, remote sensing, land cover, land use

Procedia PDF Downloads 347

4369 Faster, Lighter, More Accurate: A Deep Learning Ensemble for Content Moderation

Authors: Arian Hosseini, Mahmudul Hasan

Abstract:

To address the increasing need for efficient and accurate content moderation, we propose an efficient and lightweight deep classification ensemble structure. Our approach is based on a combination of simple visual features, designed for high-accuracy classification of violent content with low false positives. Our ensemble architecture utilizes a set of lightweight models with narrowed-down color features, and we apply it to both images and videos. We evaluated our approach using a large dataset of explosion and blast contents and compared its performance to popular deep learning models such as ResNet-50. Our evaluation results demonstrate significant improvements in prediction accuracy, while benefiting from 7.64x faster inference and lower computation cost. While our approach is tailored to explosion detection, it can be applied to other similar content moderation and violence detection use cases as well. Based on our experiments, we propose a "think small, think many" philosophy in classification scenarios. We argue that transforming a single, large, monolithic deep model into a verification-based step model ensemble of multiple small, simple, and lightweight models with narrowed-down visual features can possibly lead to predictions with higher accuracy.

Keywords: deep classification, content moderation, ensemble learning, explosion detection, video processing

Procedia PDF Downloads 28

4368 Improve Divers Tracking and Classification in Sonar Images Using Robust Diver Wake Detection Algorithm

Authors: Mohammad Tarek Al Muallim, Ozhan Duzenli, Ceyhun Ilguy

Abstract:

Harbor protection systems are so important. The need for automatic protection systems has increased over the last years. Diver detection active sonar has great significance. It used to detect underwater threats such as divers and autonomous underwater vehicle. To automatically detect such threats the sonar image is processed by algorithms. These algorithms used to detect, track and classify of underwater objects. In this work, divers tracking and classification algorithm is improved be proposing a robust wake detection method. To detect objects the sonar images is normalized then segmented based on fixed threshold. Next, the centroids of the segments are found and clustered based on distance metric. Then to track the objects linear Kalman filter is applied. To reduce effect of noise and creation of false tracks, the Kalman tracker is fine tuned. The tuning is done based on our active sonar specifications. After the tracks are initialed and updated they are subjected to a filtering stage to eliminate the noisy and unstable tracks. Also to eliminate object with a speed out of the diver speed range such as buoys and fast boats. Afterwards the result tracks are subjected to a classification stage to deiced the type of the object been tracked. Here the classification stage is to deice wither if the tracked object is an open circuit diver or a close circuit diver. At the classification stage, a small area around the object is extracted and a novel wake detection method is applied. The morphological features of the object with his wake is extracted. We used support vector machine to find the best classifier. The sonar training images and the test images are collected by ARMELSAN Defense Technologies Company using the portable diver detection sonar ARAS-2023. After applying the algorithm to the test sonar data, we get fine and stable tracks of the divers. The total classification accuracy achieved with the diver type is 97%.

Keywords: harbor protection, diver detection, active sonar, wake detection, diver classification

Procedia PDF Downloads 218

4367 Credit Risk Assessment Using Rule Based Classifiers: A Comparative Study

Authors: Salima Smiti, Ines Gasmi, Makram Soui

Abstract:

Credit risk is the most important issue for financial institutions. Its assessment becomes an important task used to predict defaulter customers and classify customers as good or bad payers. To this objective, numerous techniques have been applied for credit risk assessment. However, to our knowledge, several evaluation techniques are black-box models such as neural networks, SVM, etc. They generate applicants’ classes without any explanation. In this paper, we propose to assess credit risk using rules classification method. Our output is a set of rules which describe and explain the decision. To this end, we will compare seven classification algorithms (JRip, Decision Table, OneR, ZeroR, Fuzzy Rule, PART and Genetic programming (GP)) where the goal is to find the best rules satisfying many criteria: accuracy, sensitivity, and specificity. The obtained results confirm the efficiency of the GP algorithm for German and Australian datasets compared to other rule-based techniques to predict the credit risk.

Keywords: credit risk assessment, classification algorithms, data mining, rule extraction

Procedia PDF Downloads 159

4366 The Magnitude and Associated Factors of Immune Hemolytic Anemia among Human Immuno Deficiency Virus Infected Adults Attending University of Gondar Comprehensive Specialized Hospital North West Ethiopia 2021 GC, Cross Sectional Study Design

Authors: Samul Sahile Kebede

Abstract:

Back ground: -Immune hemolytic anemia commonly affects human immune deficiency, infected individuals. Among anemic HIV patients in Africa, the burden of IHA due to autoantibody was ranged from 2.34 to 3.06 due to the drug was 43.4%. IHA due to autoimmune is potentially a fatal complication of HIV, which accompanies the greatest percent from acquired hemolytic anemia. Objective: -The main aim of this study was to determine the magnitude and associated factors of immune hemolytic anemia among human immuno deficiency virus infected adults at the university of Gondar comprehensive specialized hospital north west Ethiopia from March to April 2021. Methods: - An institution-based cross-sectional study was conducted on 358 human immunodeficiency virus-infected adults selected by systematic random sampling at the University of Gondar comprehensive specialized hospital from March to April 2021. Data for socio-demography, dietary and clinical data were collected by structured pretested questionnaire. Five ml of venous blood was drawn from each participant and analyzed by Unicel DHX 800 hematology analyzer, blood film examination, and antihuman globulin test were performed to the diagnosis of immune hemolytic anemia. Data was entered into Epidata version 4.6 and analyzed by STATA version 14. Descriptive statistics were computed and firth penalized logistic regression was used to identify predictors. P value less than 0.005 interpreted as significant. Result; - The overall prevalence of immune hemolytic anemia was 2.8 % (10 of 358 participants). Of these, 5 were males, and 7 were in the 31 to 50 year age group. Among individuals with immune hemolytic anemia, 40 % mild and 60 % moderate anemia. The factors that showed association were family history of anemia (AOR 8.30 at 95% CI 1.56, 44.12), not eating meat (AOR 7.39 at 95% CI 1.25, 45.0), and high viral load 6.94 at 95% CI (1.13, 42.6). Conclusion and recommendation; Immune hemolytic anemia is less frequent condition in human immunodeficiency virus infected adults, and moderate anemia was common in this population. The prevalence was increased with a high viral load, a family history of anemia, and not eating meat. In these patients, early detection and treatment of immune hemolytic anemia is necessary.

Keywords: anemia, hemolytic, immune, auto immune, HIV/AIDS

Procedia PDF Downloads 90

4365 Robust Pattern Recognition via Correntropy Generalized Orthogonal Matching Pursuit

Authors: Yulong Wang, Yuan Yan Tang, Cuiming Zou, Lina Yang

Abstract:

This paper presents a novel sparse representation method for robust pattern classification. Generalized orthogonal matching pursuit (GOMP) is a recently proposed efficient sparse representation technique. However, GOMP adopts the mean square error (MSE) criterion and assign the same weights to all measurements, including both severely and slightly corrupted ones. To reduce the limitation, we propose an information-theoretic GOMP (ITGOMP) method by exploiting the correntropy induced metric. The results show that ITGOMP can adaptively assign small weights on severely contaminated measurements and large weights on clean ones, respectively. An ITGOMP based classifier is further developed for robust pattern classification. The experiments on public real datasets demonstrate the efficacy of the proposed approach.

Keywords: correntropy induced metric, matching pursuit, pattern classification, sparse representation

Procedia PDF Downloads 338

4364 Data Quality Enhancement with String Length Distribution

Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda

Abstract:

Recently, collectable manufacturing data are rapidly increasing. On the other hand, mega recall is getting serious as a social problem. Under such circumstances, there are increasing needs for preventing mega recalls by defect analysis such as root cause analysis and abnormal detection utilizing manufacturing data. However, the time to classify strings in manufacturing data by traditional method is too long to meet requirement of quick defect analysis. Therefore, we present String Length Distribution Classification method (SLDC) to correctly classify strings in a short time. This method learns character features, especially string length distribution from Product ID, Machine ID in BOM and asset list. By applying the proposal to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, it can be estimated that the requirement of quick defect analysis can be fulfilled.

Keywords: string classification, data quality, feature selection, probability distribution, string length

Procedia PDF Downloads 305

4363 Continual Learning Using Data Generation for Hyperspectral Remote Sensing Scene Classification

Authors: Samiah Alammari, Nassim Ammour

Abstract:

When providing a massive number of tasks successively to a deep learning process, a good performance of the model requires preserving the previous tasks data to retrain the model for each upcoming classification. Otherwise, the model performs poorly due to the catastrophic forgetting phenomenon. To overcome this shortcoming, we developed a successful continual learning deep model for remote sensing hyperspectral image regions classification. The proposed neural network architecture encapsulates two trainable subnetworks. The first module adapts its weights by minimizing the discrimination error between the land-cover classes during the new task learning, and the second module tries to learn how to replicate the data of the previous tasks by discovering the latent data structure of the new task dataset. We conduct experiments on HSI dataset Indian Pines. The results confirm the capability of the proposed method.

Keywords: continual learning, data reconstruction, remote sensing, hyperspectral image segmentation

Procedia PDF Downloads 230

4362 Comparing the Apparent Error Rate of Gender Specifying from Human Skeletal Remains by Using Classification and Cluster Methods

Authors: Jularat Chumnaul

Abstract:

In forensic science, corpses from various homicides are different; there are both complete and incomplete, depending on causes of death or forms of homicide. For example, some corpses are cut into pieces, some are camouflaged by dumping into the river, some are buried, some are burned to destroy the evidence, and others. If the corpses are incomplete, it can lead to the difficulty of personally identifying because some tissues and bones are destroyed. To specify gender of the corpses from skeletal remains, the most precise method is DNA identification. However, this method is costly and takes longer so that other identification techniques are used instead. The first technique that is widely used is considering the features of bones. In general, an evidence from the corpses such as some pieces of bones, especially the skull and pelvis can be used to identify their gender. To use this technique, forensic scientists are required observation skills in order to classify the difference between male and female bones. Although this technique is uncomplicated, saving time and cost, and the forensic scientists can fairly accurately determine gender by using this technique (apparently an accuracy rate of 90% or more), the crucial disadvantage is there are only some positions of skeleton that can be used to specify gender such as supraorbital ridge, nuchal crest, temporal lobe, mandible, and chin. Therefore, the skeletal remains that will be used have to be complete. The other technique that is widely used for gender specifying in forensic science and archeology is skeletal measurements. The advantage of this method is it can be used in several positions in one piece of bones, and it can be used even if the bones are not complete. In this study, the classification and cluster analysis are applied to this technique, including the Kth Nearest Neighbor Classification, Classification Tree, Ward Linkage Cluster, K-mean Cluster, and Two Step Cluster. The data contains 507 particular individuals and 9 skeletal measurements (diameter measurements), and the performance of five methods are investigated by considering the apparent error rate (APER). The results from this study indicate that the Two Step Cluster and Kth Nearest Neighbor method seem to be suitable to specify gender from human skeletal remains because both yield small apparent error rate of 0.20% and 4.14%, respectively. On the other hand, the Classification Tree, Ward Linkage Cluster, and K-mean Cluster method are not appropriate since they yield large apparent error rate of 10.65%, 10.65%, and 16.37%, respectively. However, there are other ways to evaluate the performance of classification such as an estimate of the error rate using the holdout procedure or misclassification costs, and the difference methods can make the different conclusions.

Keywords: skeletal measurements, classification, cluster, apparent error rate

Procedia PDF Downloads 235

4361 Towards Real-Time Classification of Finger Movement Direction Using Encephalography Independent Components

Authors: Mohamed Mounir Tellache, Hiroyuki Kambara, Yasuharu Koike, Makoto Miyakoshi, Natsue Yoshimura

Abstract:

This study explores the practicality of using electroencephalographic (EEG) independent components to predict eight-direction finger movements in pseudo-real-time. Six healthy participants with individual-head MRI images performed finger movements in eight directions with two different arm configurations. The analysis was performed in two stages. The first stage consisted of using independent component analysis (ICA) to separate the signals representing brain activity from non-brain activity signals and to obtain the unmixing matrix. The resulting independent components (ICs) were checked, and those reflecting brain-activity were selected. Finally, the time series of the selected ICs were used to predict eight finger-movement directions using Sparse Logistic Regression (SLR). The second stage consisted of using the previously obtained unmixing matrix, the selected ICs, and the model obtained by applying SLR to classify a different EEG dataset. This method was applied to two different settings, namely the single-participant level and the group-level. For the single-participant level, the EEG dataset used in the first stage and the EEG dataset used in the second stage originated from the same participant. For the group-level, the EEG datasets used in the first stage were constructed by temporally concatenating each combination without repetition of the EEG datasets of five participants out of six, whereas the EEG dataset used in the second stage originated from the remaining participants. The average test classification results across datasets (mean ± S.D.) were 38.62 ± 8.36% for the single-participant, which was significantly higher than the chance level (12.50 ± 0.01%), and 27.26 ± 4.39% for the group-level which was also significantly higher than the chance level (12.49% ± 0.01%). The classification accuracy within [–45°, 45°] of the true direction is 70.03 ± 8.14% for single-participant and 62.63 ± 6.07% for group-level which may be promising for some real-life applications. Clustering and contribution analyses further revealed the brain regions involved in finger movement and the temporal aspect of their contribution to the classification. These results showed the possibility of using the ICA-based method in combination with other methods to build a real-time system to control prostheses.

Keywords: brain-computer interface, electroencephalography, finger motion decoding, independent component analysis, pseudo real-time motion decoding

Procedia PDF Downloads 125