Search results for: computer virus classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4837

4477 Enhanced Image Representation for Deep Belief Network Classification of Hyperspectral Images

Authors: Khitem Amiri, Mohamed Farah

Abstract:

Image classification is a challenging task that is attracting growing interest because it helps us understand the content of images. Recently, Deep Learning (DL) based methods have given very interesting results on several benchmarks. For hyperspectral images (HSI), applying DL techniques is still challenging because of the scarcity of labeled data and the curse of dimensionality. Among other approaches, Deep Belief Network (DBN) based approaches have given fair classification accuracy. In this paper, we address the curse of dimensionality by reducing the number of bands and replacing the HSI channels with channels representing radiometric indices. Instead of using all the HSI bands, we compute radiometric indices such as NDVI (Normalized Difference Vegetation Index), NDWI (Normalized Difference Water Index), etc., and we use the combination of these indices as input to the DBN based classification model. Thus, we keep almost all the pertinent spectral information while considerably reducing the size of the image. To test our image representation, we applied our method to several HSI datasets, including the Indian Pines and Jasper Ridge datasets. It gave results comparable to state-of-the-art methods while considerably reducing training and testing time.
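The band-reduction idea in this abstract can be illustrated with a minimal sketch, assuming the standard normalized-difference definitions NDVI = (NIR - Red) / (NIR + Red) and NDWI = (Green - NIR) / (Green + NIR). The abstract does not say which index variants or band positions the authors used, so the function names, band indices and toy values below are hypothetical.

```python
import numpy as np

def normalized_difference(a, b, eps=1e-12):
    """Return (a - b) / (a + b) elementwise; eps guards against division by zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a - b) / (a + b + eps)

def radiometric_index_stack(cube, red, green, nir):
    """Build a 2-channel (NDVI, NDWI) image from an (H, W, bands) reflectance cube.

    red, green and nir are band indices. They depend on the sensor, so they
    are passed in rather than hard-coded (assumption, not from the abstract).
    """
    ndvi = normalized_difference(cube[..., nir], cube[..., red])
    ndwi = normalized_difference(cube[..., green], cube[..., nir])
    return np.stack([ndvi, ndwi], axis=-1)

# Toy 1x2 image with 3 bands ordered (green, red, nir); values are made up.
cube = np.array([[[0.10, 0.08, 0.50],    # vegetation-like pixel
                  [0.30, 0.20, 0.05]]])  # water-like pixel
features = radiometric_index_stack(cube, red=1, green=0, nir=2)
print(features.shape)  # two index channels replace the three input bands
```

On a real hyperspectral cube with hundreds of bands, the same stacking step would shrink the channel dimension to the number of indices chosen, which is the size reduction the abstract describes.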

Keywords: hyperspectral images, deep belief network, radiometric indices, image classification

Procedia PDF Downloads 246
4476 Application of Support Vector Machines in Fault Detection and Diagnosis of Power Transmission Lines

Authors: I. A. Farhat, M. Bin Hasan

Abstract:

A developed approach for the protection of power transmission lines using Support Vector Machines (SVM) technique is presented. In this paper, the SVM technique is utilized for the classification and isolation of faults in power transmission lines. Accurate fault classification and location results are obtained for all possible types of short circuit faults. As in distance protection, the approach utilizes the voltage and current post-fault samples as inputs. The main advantage of the method introduced here is that the method could easily be extended to any power transmission line.

Keywords: fault detection, classification, diagnosis, power transmission line protection, support vector machines (SVM)

Procedia PDF Downloads 536
4475 Prevalence of Trichomonas Tenax in Patients with Pulmonary Disease and Watersheds and Its Potential Implications for Pulmonary Virus Infection

Authors: Pei Chi Fang, Wei Chen Lin

Abstract:

Trichomonas tenax is a microaerophilic oral protozoan found in patients with poor oral hygiene. It participates in the inflammatory process of periodontal disease and can potentially be aspirated into the lungs, giving rise to pulmonary trichomoniasis. However, the precise roles of T. tenax in the pulmonary system remain largely unexplored and warrant comprehensive epidemiological investigation. To assess the prevalence of T. tenax infection, we collected bronchoalveolar lavage fluid (BALF) samples from hospitalized patients with lung diseases. A specific nested PCR approach was employed to determine prevalence rates, yielding 21 positive cases out of 61 samples from Ditmanson Medical Foundation Chia-Yi Christian Hospital, and 11 positive cases out of 55 samples from National Cheng Kung University Hospital. Furthermore, there is a critical need for comprehensive data regarding the presence of T. tenax in environmental surface watersheds. In this context, we present findings from investigations in the Yanshuei and Donggang river basins in southern Taiwan, which are crucial sources for public drinking water in the region. In order to elucidate potential implications on pulmonary virus infections, we conducted an analysis of gene expression level changes in H292 cell line after exposure to T. tenax. Our findings revealed significant regulation of multiple virus-related genes, including IFI44L and IFITM3. Ongoing research endeavors are focused on identifying the key components within T. tenax responsible for these observed effects. Crucially, this study lays the groundwork for a preliminary understanding of T. tenax prevalence in patients with pulmonary diseases. It also seeks to establish a meaningful correlation between lung infections and oral hygiene practices, with the ultimate aim of informing distinct treatment and prevention strategies.

Keywords: parasitology, genes, virus, human health, infection, lung

Procedia PDF Downloads 36
4474 Statistical Classification, Downscaling and Uncertainty Assessment for Global Climate Model Outputs

Authors: Queen Suraajini Rajendran, Sai Hung Cheung

Abstract:

Statistical downscaling models are required to connect global climate model outputs with local weather variables for climate change impact prediction. For reliable climate change impact studies, the uncertainty associated with the model, including natural variability, uncertainty in the climate model(s), the downscaling model, model inadequacy and the predicted results, should be quantified appropriately. In this work, the authors develop a new approach for statistical classification, statistical downscaling and uncertainty assessment and apply it to Singapore rainfall. It is a robust Bayesian uncertainty analysis methodology, with accompanying tools, that couples dependent modeling errors with the classification and statistical downscaling models, so that the dependency among modeling errors affects both model calibration and the uncertainty analysis for future prediction. Singapore data are considered here, and the uncertainty and prediction results are obtained. Based on these results, directions of research for improvement are briefly presented.

Keywords: statistical downscaling, global climate model, climate change, uncertainty

Procedia PDF Downloads 340
4473 Computer Aided Classification of Architectural Distortion in Mammograms Using Texture Features

Authors: Birmohan Singh, V.K.Jain

Abstract:

Computer aided diagnosis systems provide vital opinions to radiologists in the detection of early signs of breast cancer from mammogram images. Masses, microcalcifications and architectural distortions are the major abnormalities. In this paper, a computer aided diagnosis system is proposed for distinguishing abnormal mammograms with architectural distortion from normal mammograms. Four types of texture features (GLCM, GLRLM, fractal and spectral texture features) are extracted for the regions of suspicion. A Support Vector Machine is used as the classifier in this study. The proposed system yielded an overall sensitivity of 96.47% and accuracy of 96% for the detection of abnormalities in mammogram images collected from the Digital Database for Screening Mammography (DDSM).

Keywords: architecture distortion, mammograms, GLCM texture features, GLRLM texture features, support vector machine classifier

Procedia PDF Downloads 463
4472 Using Gene Expression Programming in Learning Process of Rough Neural Networks

Authors: Sanaa Rashed Abdallah, Yasser F. Hassan

Abstract:

This paper introduces an approach in which rough sets, gene expression programming and rough neural networks are used cooperatively for learning and classification support. The objective of the gene expression programming rough neural networks (GEP-RNN) approach is to obtain new classified data with minimum error in the training and testing process. The starting point of the GEP-RNN approach is an information system, and its output is a rough neural network structure, including the weights and thresholds, with minimum classification error.

Keywords: rough sets, gene expression programming, rough neural networks, classification

Procedia PDF Downloads 349
4471 A Statistical Approach to Classification of Agricultural Regions

Authors: Hasan Vural

Abstract:

Turkey is a favorable country for producing a great variety of agricultural products because of its different geographic and climatic conditions, which have been used to divide the country into four main and seven sub regions. This classification into seven regions has traditionally been used for data collection and publication, especially in relation to agricultural production. Afterwards, nine agricultural regions were considered. Recently, the governmental body responsible for data collection and dissemination (Turkish Institute of Statistics-TIS) has used 12 classes, which include 11 sub regions and Istanbul province. This study aims to evaluate these classification efforts based on the acreage of ten main crops over a ten-year period (1996-2005). The panel data grouped in 11 subregions were evaluated by cluster and multivariate statistical methods. It was concluded that, from the agricultural production point of view, it would be more meaningful to consider three main and eight sub-agricultural regions throughout the country.

Keywords: agricultural region, factorial analysis, cluster analysis

Procedia PDF Downloads 380
4470 The Change of Urban Land Use/Cover Using Object Based Approach for Southern Bali

Authors: I. Gusti A. A. Rai Asmiwyati, Robert J. Corner, Ashraf M. Dewan

Abstract:

Change in land use/cover (LULC) dominantly affects spatial structure and function. It can have such impacts by disrupting social and cultural practices and disturbing physical elements. Thus, it has become essential to understand the dynamics of LULC in time and space, as this understanding can be used as a critical input for developing sustainable LULC. This study was an attempt to map and monitor the LULC change in Bali, Indonesia from 2003 to 2013. Using object based classification to improve accuracy, together with change detection, multi-temporal land use/cover data were extracted from a set of ASTER satellite images. The overall accuracies of the classification maps of 2003 and 2013 were 86.99% and 80.36%, respectively. Built up area and paddy field were the dominant types of land use/cover in both years. The dominant increase in patches in 2003 illustrated rapid paddy field fragmentation and the large transformation under way. This approach is new for the diverse urban features of Bali, which has been growing fast, and it increased classification accuracy compared with manual pixel based classification.

Keywords: land use/cover, urban, Bali, ASTER

Procedia PDF Downloads 510
4469 An Advanced Automated Brain Tumor Diagnostics Approach

Authors: Berkan Ural, Arif Eser, Sinan Apaydin

Abstract:

Medical image processing has become a challenging task, and processing brain MRI images is one of the difficult parts of this area. This study proposes a hybrid approach consisting of tumor detection, extraction and analysis steps. The approach is built around a computer aided diagnostics system for identifying and detecting tumor formation in any region of the brain; the system is used for early prediction of brain tumors by means of advanced image processing and probabilistic neural network methods. Noise removal functions and image processing methods such as automatic segmentation and morphological operations are used to detect the brain tumor boundaries and to obtain the important feature parameters of the tumor region. All stages of the approach are implemented in MATLAB. First, the tumor is detected and the tumor area is contoured with a colored circle by the computer aided diagnostics program. Then, the tumor is segmented and morphological operations are applied to increase the visibility of the tumor area. During this process, the tumor area and important shape based features are also calculated. Finally, using the probabilistic neural network method and further classification steps, the tumor area and the type of the tumor are obtained. The future aim of this study is to detect the severity of lesions through classes of brain tumor, using multi-class classification and neural network stages, and to create a user friendly environment using a GUI in MATLAB. In the experimental part of the study, 100 images are used to train the diagnostics system and 100 out-of-sample images are used to test it and check the overall results. The preliminary results demonstrate high classification accuracy for the neural network structure. These results also motivate us to extend this framework to detect and localize tumors in other organs.

Keywords: image processing algorithms, magnetic resonance imaging, neural network, pattern recognition

Procedia PDF Downloads 385
4468 Land Cover Classification System for the Estimation of Carbon Storage in Terrestrial Ecosystems

Authors: Lei Zhang

Abstract:

The carbon cycle greatly influences global change, and the land cover changes contribute to the status and rate of the carbon budget in ecosystems. This paper proposes a land cover classification system for mapping land cover, the national ecological environment assessment, and estimating carbon storage in ecosystems. The classification system consists of basic land cover classes at levels Ⅰ and Ⅱ and auxiliary features at level III. The basic 38 classes characterizing land cover features are derived from 19 criteria referring to composition, structure, pattern, phenology, etc. The basic classes reflect the status of carbon storage in ecosystems. The auxiliary classes at level III complement the attributes of higher levels by 9 criteria. The 5 environmental criteria of temperature, moisture, landform, aspect and slope mainly reflect the potential and intensity of carbon storage in ecosystems. The disturbance of vegetation succession caused by land use type influences the vegetation carbon budget. The other 3 vegetation cover criteria, growth period, and species characteristics further refine the vegetation types. The hierarchical structure of the land cover map (the classes of levels Ⅰ and Ⅱ) is independent of the products of level III, which is helpful for land cover product management and applications. The classification system has been adopted in the Chinese national land cover database for the carbon budget in ecosystems at a 30 m scale.

Keywords: classification system, land cover, ecosystem, carbon storage, object based

Procedia PDF Downloads 31
4467 Controlled Chemotherapy Strategy Applied to HIV Model

Authors: Shohel Ahmed, Md. Abdul Alim, Sumaiya Rahman

Abstract:

Optimal control can be helpful to test and compare different vaccination strategies of a certain disease. The mathematical model of HIV we consider here is a set of ordinary differential equations (ODEs) describing the interactions of CD4+T cells of the immune system with the human immunodeficiency virus (HIV). As an early treatment setting, we investigate an optimal chemotherapy strategy where control represents the percentage of effect the chemotherapy has on the system. The aim is to obtain a new optimal chemotherapeutic strategy where an isoperimetric constraint on the chemotherapy supply plays a crucial role. We outline the steps in formulating an optimal control problem, derive optimality conditions and demonstrate numerical results of an optimal control for the model. Numerical results illustrate how such a constraint alters the optimal vaccination schedule and its effect on cell-virus interactions.

Keywords: chemotherapy of HIV, optimal control involving ODEs, optimality conditions, Pontryagin’s maximum principle

Procedia PDF Downloads 309
4466 From Type-I to Type-II Fuzzy System Modeling for Diagnosis of Hepatitis

Authors: Shahabeddin Sotudian, M. H. Fazel Zarandi, I. B. Turksen

Abstract:

Hepatitis is one of the most common and dangerous diseases that affects humankind, and exposes millions of people to serious health risks every year. Diagnosis of Hepatitis has always been a challenge for physicians. This paper presents an effective method for diagnosis of hepatitis based on interval Type-II fuzzy. This proposed system includes three steps: pre-processing (feature selection), Type-I and Type-II fuzzy classification, and system evaluation. KNN-FD feature selection is used as the preprocessing step in order to exclude irrelevant features and to improve classification performance and efficiency in generating the classification model. In the fuzzy classification step, an “indirect approach” is used for fuzzy system modeling by implementing the exponential compactness and separation index for determining the number of rules in the fuzzy clustering approach. Therefore, we first proposed a Type-I fuzzy system that had an accuracy of approximately 90.9%. In the proposed system, the process of diagnosis faces vagueness and uncertainty in the final decision. Thus, the imprecise knowledge was managed by using interval Type-II fuzzy logic. The results that were obtained show that interval Type-II fuzzy has the ability to diagnose hepatitis with an average accuracy of 93.94%. The classification accuracy obtained is the highest one reached thus far. The aforementioned rate of accuracy demonstrates that the Type-II fuzzy system has a better performance in comparison to Type-I and indicates a higher capability of Type-II fuzzy system for modeling uncertainty.

Keywords: hepatitis disease, medical diagnosis, type-I fuzzy logic, type-II fuzzy logic, feature selection

Procedia PDF Downloads 280
4465 Detection and Distribution Pattern of Prevelant Genotypes of Hepatitis C in a Tertiary Care Hospital of Western India

Authors: Upasana Bhumbla

Abstract:

Background: Hepatitis C virus is a major cause of chronic hepatitis, which can further lead to cirrhosis of the liver and hepatocellular carcinoma. Worldwide the burden of Hepatitis C infection has become a serious threat to the human race. Hepatitis C virus (HCV) has population-specific genotypes and provides valuable epidemiological and therapeutic information. Genotyping and assessment of viral load in HCV patients are important for planning the therapeutic strategies. The aim of the study is to study the changing trends of prevalence and genotypic distribution of hepatitis C virus in a tertiary care hospital in Western India. Methods: It is a retrospective study; blood samples were collected and tested for anti HCV antibodies by ELISA in Dept. of Microbiology. In seropositive Hepatitis C patients, quantification of HCV-RNA was done by real-time PCR and in HCV-RNA positive samples, genotyping was conducted. Results: A total of 114 patients who were seropositive for Anti HCV were recruited in the study, out of which 79 (69.29%) were HCV-RNA positive. Out of these positive samples, 54 were further subjected to genotype determination using real-time PCR. Genotype was not detected in 24 samples due to low viral load; 30 samples were positive for genotype. Conclusion: Knowledge of genotype is crucial for the management of HCV infection and prediction of prognosis. Patients infected with HCV genotype 1 and 4 will have to receive Interferon and Ribavirin for 48 weeks. Patients with these genotypes show a poor sustained viral response when tested 24 weeks after completion of therapy. On the contrary, patients infected with HCV genotype 2 and 3 are reported to have a better response to therapy.

Keywords: hepatocellular, genotype, ribavirin, seropositive

Procedia PDF Downloads 104
4464 A Survey of Skin Cancer Detection and Classification from Skin Lesion Images Using Deep Learning

Authors: Joseph George, Anne Kotteswara Roa

Abstract:

Skin disease is one of the most common health issues faced by people nowadays. Skin cancer (SC) is one such disease, and its detection relies on skin biopsy results and the expertise of doctors, which is time-consuming and can produce inaccurate results. Detecting skin cancer at an early stage is a challenging task, yet the disease spreads easily through the body and leads to an increase in the mortality rate. Skin cancer is curable when it is detected at an early stage. Accurate classification depends on identifying the cancer from disease features such as shape, size, color and symmetry. Many skin diseases share similar characteristics, which makes it challenging to select important features from skin cancer image datasets. An automated skin cancer detection and classification framework is therefore needed to improve diagnostic accuracy and to address the scarcity of human experts. Recently, deep learning techniques such as Convolutional neural network (CNN), Deep belief neural network (DBN), Artificial neural network (ANN), Recurrent neural network (RNN), and Long short-term memory (LSTM) have been widely used for the identification and classification of skin cancers. This survey reviews different DL techniques for skin cancer identification and classification. Performance metrics such as precision, recall, accuracy, sensitivity, specificity, and F-measure are used to evaluate the effectiveness of SC identification using DL techniques. By using these DL techniques, the classification accuracy increases along with the mitigation of computational complexities and time consumption.
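The performance metrics named in this abstract all follow from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the counts are invented for illustration and are not results from the survey.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive and one
    negative case, and at least one positive prediction).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # recall is the same quantity as sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts for a lesion classifier evaluated on 100 images.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that recall and sensitivity are two names for the same ratio, so a paper reporting both is reporting one number twice.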

Keywords: skin cancer, deep learning, performance measures, accuracy, datasets

Procedia PDF Downloads 98
4463 A Normalized Non-Stationary Wavelet Based Analysis Approach for a Computer Assisted Classification of Laryngoscopic High-Speed Video Recordings

Authors: Mona K. Fehling, Jakob Unger, Dietmar J. Hecker, Bernhard Schick, Joerg Lohscheller

Abstract:

Voice disorders originate from disturbances of the vibration patterns of the two vocal folds located within the human larynx. Consequently, the visual examination of vocal fold vibrations is an integral part of the clinical diagnostic process. For an objective analysis of the vocal fold vibration patterns, the two-dimensional vocal fold dynamics are captured during sustained phonation using an endoscopic high-speed camera. In this work, we present an approach that allows a fully automatic analysis of the high-speed video data, including a computerized classification of healthy and pathological voices. The approach is based on a wavelet-based analysis of so-called phonovibrograms (PVG), which are extracted from the high-speed videos and comprise the entire two-dimensional vibration pattern of each vocal fold individually. Using a principal component analysis (PCA) strategy, a low-dimensional feature set is computed from each phonovibrogram. From the PCA-space, clinically relevant measures can be derived that objectively quantify vibration abnormalities. In the first part of the work, it is shown that, using a machine learning approach, the derived measures are suitable for distinguishing automatically between healthy and pathological voices. Within the approach, the formation of the PCA-space, and consequently the extracted quantitative measures, depend on the clinical data used to compute the principal components. Therefore, in the second part of the work, we propose a strategy to normalize the PCA-space by registering it to a coordinate system using a set of synthetically generated vibration patterns. The results show that, owing to the normalization step, potential ambiguity of the parameter space can be eliminated. The normalization further allows a direct comparison of research results that are based on PCA-spaces obtained from different clinical subjects.
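The PCA step can be sketched generically as follows. This is not the authors' pipeline: the random matrix stands in for hypothetical per-recording feature vectors, and the function names are illustrative. It also shows why the abstract's normalization matters, since the mean and components depend entirely on the data used to fit them.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA with an SVD of the mean-centered data (rows are samples)."""
    mean = X.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]  # orthonormal principal directions
    explained_variance = singular_values[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, components, explained_variance

def pca_transform(X, mean, components):
    """Project samples onto the principal directions."""
    return (X - mean) @ components.T

# Stand-in data: 30 recordings, 10 features each (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
mean, components, explained_variance = pca_fit(X, n_components=3)
scores = pca_transform(X, mean, components)
print(scores.shape)
```

Fitting on a different set of recordings would yield different components, so scores from two studies are only comparable after both spaces are registered to a common reference, which is the role the abstract assigns to the synthetic vibration patterns.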

Keywords: Wavelet-based analysis, Multiscale product, normalization, computer assisted classification, high-speed laryngoscopy, vocal fold analysis, phonovibrogram

Procedia PDF Downloads 238
4462 Comparison of Several Diagnostic Methods for Detecting Bovine Viral Diarrhea Virus Infection in Cattle

Authors: Azizollah Khodakaram- Tafti, Ali Mohammadi, Ghasem Farjanikish

Abstract:

Bovine viral diarrhea virus (BVDV), a member of the genus Pestivirus in the family Flaviviridae, is one of the most important viral pathogens of cattle worldwide. The aim of the present study was to compare several diagnostic methods and to determine the prevalence of BVDV infection for the first time in dairy herds of Fars province, Iran. For initial screening, a total of 400 blood samples were randomly collected from 12 industrial dairy herds and analyzed using reverse transcription (RT)-PCR on the buffy coat. In the second step, blood samples and ear notch biopsies were collected from 100 cattle of infected farms and tested by antigen capture ELISA (ACE), RT-PCR and immunohistochemistry (IHC). Nested RT-PCR (outer primers 0I100/1400R and inner primers BD1/BD2) was successful in 16 out of 400 buffy coat samples (4%), indicating acute infection in the initial screening. Also, 8 out of 100 samples (2%) were positive for persistent infection (PI) by all of the diagnostic tests, namely RT-PCR, ACE and IHC on buffy coat, serum and skin samples, respectively. Immunoreactivity for BVDV antigen, seen as brown, coarsely to finely granular staining, was observed within the cytoplasm of epithelial cells of the epidermis and hair follicles and in subcutaneous stromal cells. These findings confirm the importance of monitoring BVDV infection in cattle of this region and suggest detecting and eliminating PI calves to control and eradicate this disease.

Keywords: antigen capture ELISA, bovine viral diarrhea virus, immunohistochemistry, RT-PCR, cattle

Procedia PDF Downloads 338
4461 Random Subspace Ensemble of CMAC Classifiers

Authors: Somaiyeh Dehghan, Mohammad Reza Kheirkhahan Haghighi

Abstract:

The rapid growth of domains whose data have a large number of features but a limited number of samples has made it difficult to construct strong classifiers. Reducing the dimensionality of the feature space thus becomes an essential step in classification tasks. The random subspace method (or attribute bagging) is an ensemble classifier consisting of several base learners, each of which is trained on a subset of the features. In the present paper, we introduce a Random Subspace Ensemble of CMAC neural networks (RSE-CMAC), in which each network is trained on a subset of the features. We then use this model for classification. To evaluate the performance of our model, we compare it with the bagging algorithm on 36 UCI datasets. The results reveal that the new model has better performance.
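The random subspace idea can be sketched as below. A simple nearest-centroid classifier is swapped in as the base learner purely for brevity; it is not a CMAC network, and the class names, ensemble size and toy data are all assumptions rather than details from the paper.

```python
import numpy as np

class NearestCentroid:
    """Stand-in base learner (NOT a CMAC network): predict the closest class mean."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        distances = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[distances.argmin(axis=1)]

class RandomSubspaceEnsemble:
    """Train each base learner on a random feature subset; combine by majority vote."""

    def __init__(self, n_estimators=15, subspace_size=2, seed=0):
        self.n_estimators = n_estimators
        self.subspace_size = subspace_size
        self.seed = seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.members_ = []
        for _ in range(self.n_estimators):
            # Draw a feature subset without replacement for this member.
            features = rng.choice(X.shape[1], size=self.subspace_size, replace=False)
            learner = NearestCentroid().fit(X[:, features], y)
            self.members_.append((features, learner))
        return self

    def predict(self, X):
        # votes has shape (n_estimators, n_samples).
        votes = np.stack([learner.predict(X[:, features])
                          for features, learner in self.members_])
        predictions = []
        for sample_votes in votes.T:
            values, counts = np.unique(sample_votes, return_counts=True)
            predictions.append(values[counts.argmax()])
        return np.array(predictions)

# Toy data: two well-separated classes in six dimensions (synthetic).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 6)),
               rng.normal(5.0, 1.0, size=(20, 6))])
y = np.array([0] * 20 + [1] * 20)
model = RandomSubspaceEnsemble().fit(X, y)
accuracy = (model.predict(X) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Bagging, the baseline in the abstract, would instead resample the rows (training samples) for each member while keeping all features.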

Keywords: classification, random subspace, ensemble, CMAC neural network

Procedia PDF Downloads 305
4460 Application of Remote Sensing and GIS in Assessing Land Cover Changes within Granite Quarries around Brits Area, South Africa

Authors: Refilwe Moeletsi

Abstract:

Dimension stone quarrying around the Brits and Belfast areas started in the early 1930s and has been growing rapidly since then. Environmental impacts associated with these quarries have not been documented, and hence this study aims at detecting any change in the environment that might have been caused by these activities. Landsat images were used to assess land use/land cover changes in Brits quarries from 1998 to 2015. A supervised classification using a maximum likelihood classifier was applied to classify each image into different land use/land cover types. Classification accuracy was assessed using Google Earth™ as a source of reference data. The post-classification change detection method was used to determine changes. The results revealed a significant increase in granite quarries and a corresponding decrease in vegetation cover within the study region.
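Post-classification change detection amounts to cross-tabulating two independently classified maps of the same area. The sketch below assumes both maps are co-registered integer arrays; the class codes and the tiny 3x3 maps are invented for illustration.

```python
import numpy as np

def change_matrix(before, after, n_classes):
    """Cross-tabulate two classified maps.

    Entry [i, j] counts pixels labeled class i in the earlier map and
    class j in the later map; the diagonal holds unchanged pixels.
    """
    before = np.asarray(before).ravel()
    after = np.asarray(after).ravel()
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (before, after), 1)  # accumulates repeated index pairs
    return matrix

# Hypothetical class codes: 0 = vegetation, 1 = quarry, 2 = other.
map_1998 = np.array([[0, 0, 0],
                     [0, 1, 2],
                     [2, 2, 0]])
map_2015 = np.array([[0, 1, 1],
                     [0, 1, 2],
                     [2, 2, 1]])

matrix = change_matrix(map_1998, map_2015, n_classes=3)
changed_pixels = matrix.sum() - np.trace(matrix)
net_change = matrix.sum(axis=0) - matrix.sum(axis=1)  # later total minus earlier total
print(matrix)
print("changed pixels:", changed_pixels)
print("net change per class:", net_change)
```

In this toy case three vegetation pixels become quarry, which mirrors the direction of change the abstract reports. Multiplying pixel counts by the pixel area would convert them to surface area.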

Keywords: remote sensing, GIS, change detection, granite quarries

Procedia PDF Downloads 284
4459 Molecular Detection of Acute Virus Infection in Children Hospitalized with Diarrhea in North India during 2014-2016

Authors: Ali Ilter Akdag, Pratima Ray

Abstract:

Background: Acute gastroenteritis viruses such as rotavirus, astrovirus, and adenovirus are mainly responsible for diarrhea in children under 5 years old. Molecular detection of these viruses is crucial for understanding them and for developing effective treatment. This study aimed to determine the prevalence of these common viruses in children under 5 years old who presented with diarrhea at the Lala Lajpat Rai Memorial Medical College (LLRM) centre, Meerut, North India. Methods: A total of 312 fecal samples were collected over 3 years, in 2014 (n = 118), 2015 (n = 128) and 2016 (n = 66), from children under 5 years of age who presented with acute diarrhea at the LLRM centre. All samples were first tested by EIA/RT-PCR for rotavirus, adenovirus and astrovirus. Results: Among the 312 samples from children with acute diarrhea, rotavirus A was the most frequent virus identified (57 cases; 18.2%), followed by astrovirus (28 cases; 8.9%) and adenovirus (21 cases; 6.7%). Mixed infections were found in 14 cases, all of which presented with acute diarrhea (14/312; 4.48%). Conclusions: These viruses are a major cause of diarrhea in children under 5 years old in North India. Rotavirus A is the most common etiological agent, followed by astrovirus. This surveillance is important for vaccine development for the entire population. Virus detection varies from year to year because of differences in the season of sampling, method of sampling, hygiene conditions, socioeconomic level of the population, enrolment criteria, and virus detection methods. Astrovirus was found more often than rotavirus in 2015, but over the full three-year study rotavirus A was mainly responsible for severe diarrhea in children under 5 years old in North India. This emphasizes the need for cost-effective diagnostic assays for rotaviruses, which would help to determine the disease burden.

Keywords: adenovirus, Astrovirus, hospitalized children, Rotavirus

Procedia PDF Downloads 114
4458 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, a method combining the Pohl Seidman's deep belief network with the self-organizing neural network is proposed to classify the target. The method mainly addresses the high nonlinearity of hyperspectral images, the high sample dimension and the difficulty of designing the classifier. The main features of the original data are extracted by the deep belief network. During feature extraction, samples with known labels are added to fine-tune the network, enriching the main characteristics. The extracted feature vectors are then classified by the self-organizing neural network. This method effectively reduces the spectral dimension of the data while preserving a large amount of the raw data information. It addresses the limitations of traditional clustering and the long training time of deep learning algorithms when labeled samples are scarce, and it improves classification accuracy and robustness. Data simulation results show that the proposed network structure achieves higher classification precision when only a small number of labeled samples are available.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 314
4457 Automatic Method for Classification of Informative and Noninformative Images in Colonoscopy Video

Authors: Nidhal K. Azawi, John M. Gauch

Abstract:

Colorectal cancer is one of the leading causes of cancer death in the US and the world, which is why millions of colonoscopy examinations are performed annually. Unfortunately, noise, specular highlights, and motion artifacts corrupt many images in a typical colonoscopy exam. The goal of our research is to produce automated techniques to detect and correct or remove these noninformative images from colonoscopy videos, so physicians can focus their attention on informative images. In this research, we first automatically extract features from images. Then we use machine learning and a deep neural network to classify colonoscopy images as either informative or noninformative. Our results show that we achieve image classification accuracy between 92% and 98%. We also show how the removal of noninformative images together with image alignment can aid in the creation of image panoramas and other visualizations of colonoscopy images.

Keywords: colonoscopy classification, feature extraction, image alignment, machine learning

Procedia PDF Downloads 228
4456 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study

Authors: Faisal Aburub, Wael Hadi

Abstract:

Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships, or to assign unseen objects to one of the predefined groups. In this paper, we investigate four well-known data mining algorithms for predicting groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed the other algorithms in terms of the classification accuracy, precision and F1 evaluation measures on the datasets of groundwater areas collected from the Jordanian Ministry of Water and Irrigation.
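The evaluation measures named in this abstract (accuracy, precision and F1) can all be derived from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def evaluation_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for the class treated as "positive".
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one sample is required")
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(evaluation_measures(tp=40, fp=10, fn=5, tn=45))
```

F1 is the harmonic mean of precision and recall, so it penalizes a classifier that scores well on one of the two but poorly on the other.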

Keywords: classification, data mining, evaluation measures, groundwater

Procedia PDF Downloads 252
4455 Spatio-Temporal Assessment of Urban Growth and Land Use Change in Islamabad Using Object-Based Classification Method

Authors: Rabia Shabbir, Sheikh Saeed Ahmad, Amna Butt

Abstract:

Rapid land use changes have taken place in Islamabad, the capital city of Pakistan, over the past decades due to accelerated urbanization and industrialization. In this study, land use changes in the metropolitan area of Islamabad were observed through the combined use of GIS and satellite remote sensing over a period of 15 years. High-resolution Google Earth images for 2000-2015 were downloaded, and an object-based classification method was applied using eCognition software for accurate classification. Information on urban settlements, industrial area, barren land, agricultural area, vegetation, water, and transportation infrastructure was extracted. The results showed that the city experienced spatial expansion, rapid urban growth, land use change and expanding transportation infrastructure. The study concluded that the integration of GIS and remote sensing is an effective approach for analyzing the spatial pattern of urban growth and land use change.

Keywords: land use change, urban growth, Islamabad, object-based classification, Google Earth, remote sensing, GIS

Procedia PDF Downloads 130
4454 Analyzing Tools and Techniques for Classification In Educational Data Mining: A Survey

Authors: D. I. George Amalarethinam, A. Emima

Abstract:

Educational Data Mining (EDM) is one of the newest topics to emerge in recent years, and it is concerned with developing methods for analyzing various types of data gathered from educational settings. EDM methods and techniques, together with machine learning algorithms, are used to extract meaningful and usable information from huge databases. For scientists and researchers, realistic applications of machine learning in EDM offer new frontiers and present new problems. One of the most important research areas in EDM is predicting student success. Prediction algorithms and techniques must be developed to forecast students' performance, which helps tutors and institutions raise the level of student performance. This paper examines various classification techniques in prediction methods and data mining tools used in EDM.

Keywords: classification technique, data mining, EDM methods, prediction methods

Procedia PDF Downloads 99
4453 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study combines morphological evaluation and sentiment analysis for the task of classifying farmer suicide cases reported in the state of Punjab, India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens, followed by training of the proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45%, respectively. The overall accuracy of sentiment classification obtained using the proposed framework on 275 Punjabi text documents is 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 290
4452 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing step for avoiding the Hughes phenomenon, which arises from the difficulty of collecting training samples. Hence, many researchers have developed feature selection methods, such as F-score and HSIC (Hilbert-Schmidt Independence Criterion), to improve hyperspectral image classification. However, most of them only consider class separability in the original space, i.e., a linear class separability. In this study, we propose a nonlinear class separability measure based on the kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability is formed by a generalized RBF kernel with a different bandwidth for each feature. Moreover, it considers both the within-class separability and the between-class separability. A genetic algorithm is applied to tune these bandwidths so that the within-class separability is smallest and the between-class separability is largest simultaneously. This indicates that the corresponding feature space is more suitable for classification. In addition, the corresponding nonlinear classification boundary can separate classes very well. These optimal bandwidths also show the importance of bands for hyperspectral image classification. The reciprocals of these bandwidths can be viewed as weights of bands: the smaller the bandwidth, the larger the weight of the band and the more important it is for classification. Hence, the descending order of the reciprocals of the bandwidths gives an order for selecting appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate that the feature subsets selected by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of the samples were randomly selected to form the training dataset.
All non-background samples were used to form the testing dataset. A support vector machine was applied to classify these testing samples based on the selected feature subsets. According to the experiments on the Indian Pine Site data set with 220 bands, the highest accuracies obtained by the proposed method, F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively. However, the proposed method selects 158 features, whereas F-score and HSIC select 168 features and 217 features, respectively. Moreover, the classification accuracies increase dramatically using only the first few features. The classification accuracies for feature subsets of 10, 20, 50, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, using only half of the features (110 features) with the proposed method, the corresponding classification accuracy (0.84168) is close to the highest classification accuracy, 0.8795. Similar results are obtained for the other two hyperspectral image data sets, the PAVIA data set and the Salinas A data set. These results illustrate that our proposed method can efficiently find feature subsets that improve hyperspectral image classification. One can first apply the proposed method to determine a suitable feature subset for a specific purpose. Researchers can then use only the corresponding sensors to obtain the hyperspectral image and classify the samples. This can not only improve the classification performance but also reduce the cost of obtaining hyperspectral images.
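The band-ranking idea in this abstract can be sketched in a few lines. The abstract does not give the kernel's exact form, so the code below assumes one common per-feature-bandwidth RBF, k(x, z) = exp(-Σ_d (x_d - z_d)² / (2σ_d²)). The function names and example bandwidths are hypothetical, and the genetic-algorithm tuning step is not shown.

```python
import math
from typing import List, Sequence


def generalized_rbf(x: Sequence[float], z: Sequence[float],
                    bandwidths: Sequence[float]) -> float:
    """RBF kernel with one bandwidth per feature (assumed form, see lead-in)."""
    if not (len(x) == len(z) == len(bandwidths)):
        raise ValueError("x, z and bandwidths must have the same length")
    if any(s <= 0 for s in bandwidths):
        raise ValueError("bandwidths must be positive")
    exponent = sum((a - b) ** 2 / (2.0 * s ** 2)
                   for a, b, s in zip(x, z, bandwidths))
    return math.exp(-exponent)


def rank_bands(bandwidths: Sequence[float]) -> List[int]:
    """Order band indices by descending weight, where weight = 1 / bandwidth."""
    weights = [1.0 / s for s in bandwidths]
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)


if __name__ == "__main__":
    tuned = [2.0, 0.5, 1.0]           # hypothetical bandwidths for three bands
    order = rank_bands(tuned)         # the smallest bandwidth ranks first
    print("band order:", order)
    print("top-2 subset:", order[:2])
    print("k(x, z):", generalized_rbf([0.1, 0.4, 0.3], [0.2, 0.1, 0.3], tuned))
```

Taking the first k indices of the ranking yields a nested family of feature subsets, which matches how the abstract reports accuracy for 10, 20, 50, and 110 features.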

Keywords: hyperspectral image classification, nonlinear feature selection, kernel trick, support vector machine

Procedia PDF Downloads 241
4451 Personal Information Classification Based on Deep Learning in Automatic Form Filling System

Authors: Shunzuo Wu, Xudong Luo, Yuanxiu Liao

Abstract:

Recently, the rapid development of deep learning has allowed artificial intelligence (AI) to penetrate many fields, replacing manual work there. In particular, AI systems have become a research focus in the field of office automation. To meet real needs in office automation, in this paper we develop an automatic form filling system. Specifically, it uses two classical neural network models and several word embedding models to classify various relevant information elicited from the Internet. When training the neural network models, we use less noisy and balanced data. We conduct a series of experiments to test our system, and the results show that it can achieve better classification results.

Keywords: artificial intelligence and office, NLP, deep learning, text classification

Procedia PDF Downloads 154
4450 Peptide-Based Platform for Differentiation of Antigenic Variations within Influenza Virus Subtypes (Flutype)

Authors: Henry Memczak, Marc Hovestaedt, Bernhard Ay, Sandra Saenger, Thorsten Wolff, Frank F. Bier

Abstract:

Influenza viruses cause flu epidemics every year and serious pandemics at longer intervals. The only cost-effective protection against influenza is vaccination. Due to rapid mutation, new subtypes appear continuously, which requires annual reimmunization. For a correct vaccination recommendation, the circulating influenza strains have to be detected promptly and precisely and characterized with respect to their antigenic properties. During the 2016/17 flu season, a wrong vaccination recommendation was given because of the long interval between identification of the relevant influenza vaccine strains and the outbreak of the flu epidemic during the following winter. Due to such recurring incidents of vaccine mismatches, there is a great need to speed up the process chain from identifying the right vaccine strains to their administration. The monitoring of subtypes as part of this process chain is carried out by national reference laboratories within the WHO Global Influenza Surveillance and Response System (GISRS). To this end, thousands of viruses from patient samples (e.g., throat smears) are isolated and analyzed each year. Currently, this analysis involves complex and time-intensive (several weeks) animal experiments to produce specific hyperimmune sera in ferrets, which are necessary for determining the antigen profiles of circulating virus strains. These tests are also difficult to standardize and reproduce, which restricts the significance of the results. To replace this test, a peptide-based assay for influenza virus subtyping from corresponding virus samples was developed. The viruses are differentiated by a set of specifically designed peptidic recognition molecules which interact differently with the different influenza virus subtypes. The differentiation of influenza subtypes is performed by pattern recognition guided by machine learning algorithms, without any animal experiments.
Synthetic peptides are immobilized in multiplex format on various platforms (e.g., 96-well microtiter plate, microarray). Afterwards, the viruses are incubated and analyzed comparing different signaling mechanisms and a variety of assay conditions. Differentiation of a range of influenza subtypes, including H1N1, H3N2, H5N1, as well as fine differentiation of single strains within these subtypes is possible using the peptide-based subtyping platform. Thereby, the platform could be capable of replacing the current antigenic characterization of influenza strains using ferret hyperimmune sera.

Keywords: antigenic characterization, influenza-binding peptides, influenza subtyping, influenza surveillance

Procedia PDF Downloads 128
4449 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine learning and data mining are two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a widely used technique for predicting qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries, especially China, air quality classification has become one of the most important topics in air quality research and modelling. This study introduces a hybrid classification model based on information theory and Support Vector Machine (SVM), using the air quality data of four cities in China, namely Beijing, Guangzhou, Shanghai and Tianjin, from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified daily air quality into 6 levels, namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent, based on their respective Air Quality Index (AQI) values. Using information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. The SVM algorithm is then applied to the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.
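Information gain, the feature selection criterion named in this abstract, is the reduction in label entropy obtained by splitting the data on a feature. The sketch below covers the categorical case only; the function names and toy data are hypothetical and do not reproduce the authors' pipeline.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(feature_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """IG = H(labels) - sum over feature values v of P(v) * H(labels | v)."""
    if len(feature_values) != len(labels):
        raise ValueError("feature_values and labels must have the same length")
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(group) / total * entropy(group)
                      for group in groups.values())
    return entropy(labels) - conditional


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    levels = ["Good", "Good", "Light Pollution", "Light Pollution"]
    wind = ["strong", "strong", "weak", "weak"]       # separates the levels
    weekday = ["Mon", "Tue", "Mon", "Tue"]            # carries no signal here
    print("IG(wind):", information_gain(wind, levels))
    print("IG(weekday):", information_gain(weekday, levels))
```

Continuous numeric features would need to be binned or thresholded before this function applies; how the authors handled that step is not stated in the abstract.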

Keywords: machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation

Procedia PDF Downloads 202
4448 Plant Identification Using Convolution Neural Network and Vision Transformer-Based Models

Authors: Virender Singh, Mathew Rees, Simon Hampton, Sivaram Annadurai

Abstract:

Plant identification is a challenging task that aims to identify the family, genus, and species according to plant morphological features. Automated deep learning-based computer vision algorithms are widely used for identifying plants and can help users narrow down the possibilities. However, numerous morphological similarities between and within species render correct classification difficult. In this paper, we tested custom convolution neural network (CNN) and vision transformer (ViT) based models using the PyTorch framework to classify plants. We used a large dataset of 88,000 images provided by the Royal Horticultural Society (RHS) and a smaller dataset of 16,000 images from the PlantClef 2015 dataset for classifying plants at the genus and species levels, respectively. Our results show that for classifying plants at the genus level, ViT models perform better than the CNN-based models ResNet50 and ResNet-RS-420 and other state-of-the-art CNN-based models suggested in previous studies on a similar dataset. The ViT model achieved a top accuracy of 83.3% for classifying plants at the genus level. For classifying plants at the species level, ViT models again perform better than the CNN-based models ResNet50 and ResNet-RS-420, with a top accuracy of 92.5%. We show that the correct set of augmentation techniques plays an important role in classification success. In conclusion, these results could help end users, professionals and the general public alike identify plants more quickly and with improved accuracy.

Keywords: plant identification, CNN, image processing, vision transformer, classification

Procedia PDF Downloads 67