Search results for: ultrafine classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2129

Search results for: ultrafine classification

1829 Review of Different Machine Learning Algorithms

Authors: Syed Romat Ali Shah, Bilal Shoaib, Saleem Akhtar, Munib Ahmad, Shahan Sadiqui

Abstract:

Classification is a data mining technique, which is recognizedon Machine Learning (ML) algorithm. It is used to classifythe individual articlein a knownofinformation into a set of predefinemodules or group. Web mining is also a portion of that sympathetic of data mining methods. The main purpose of this paper to analysis and compare the performance of Naïve Bayse Algorithm, Decision Tree, K-Nearest Neighbor (KNN), Artificial Neural Network (ANN)and Support Vector Machine (SVM). This paper consists of different ML algorithm and their advantages and disadvantages and also define research issues.

Keywords: Data Mining, Web Mining, classification, ML Algorithms

Procedia PDF Downloads 254
1828 The Optimization of Decision Rules in Multimodal Decision-Level Fusion Scheme

Authors: Andrey V. Timofeev, Dmitry V. Egorov

Abstract:

This paper introduces an original method of parametric optimization of the structure for multimodal decision-level fusion scheme which combines the results of the partial solution of the classification task obtained from assembly of the mono-modal classifiers. As a result, a multimodal fusion classifier which has the minimum value of the total error rate has been obtained.

Keywords: classification accuracy, fusion solution, total error rate, multimodal fusion classifier

Procedia PDF Downloads 434
1827 An Overview of the Porosity Classification in Carbonate Reservoirs and Their Challenges: An Example of Macro-Microporosity Classification from Offshore Miocene Carbonate in Central Luconia, Malaysia

Authors: Hammad T. Janjuhah, Josep Sanjuan, Mohamed K. Salah

Abstract:

Biological and chemical activities in carbonates are responsible for the complexity of the pore system. Primary porosity is generally of natural origin while secondary porosity is subject to chemical reactivity through diagenetic processes. To understand the integrated part of hydrocarbon exploration, it is necessary to understand the carbonate pore system. However, the current porosity classification scheme is limited to adequately predict the petrophysical properties of different reservoirs having various origins and depositional environments. Rock classification provides a descriptive method for explaining the lithofacies but makes no significant contribution to the application of porosity and permeability (poro-perm) correlation. The Central Luconia carbonate system (Malaysia) represents a good example of pore complexity (in terms of nature and origin) mainly related to diagenetic processes which have altered the original reservoir. For quantitative analysis, 32 high-resolution images of each thin section were taken using transmitted light microscopy. The quantification of grains, matrix, cement, and macroporosity (pore types) was achieved using a petrographic analysis of thin sections and FESEM images. The point counting technique was used to estimate the amount of macroporosity from thin section, which was then subtracted from the total porosity to derive the microporosity. The quantitative observation of thin sections revealed that the mouldic porosity (macroporosity) is the dominant porosity type present, whereas the microporosity seems to correspond to a sum of 40 to 50% of the total porosity. It has been proven that these Miocene carbonates contain a significant amount of microporosity, which significantly complicates the estimation and production of hydrocarbons. Neglecting its impact can increase uncertainty about estimating hydrocarbon reserves. Due to the diversity of geological parameters, the application of existing porosity classifications does not allow a better understanding of the poro-perm relationship. However, the classification can be improved by including the pore types and pore structures where they can be divided into macro- and microporosity. Such studies of microporosity identification/classification represent now a major concern in limestone reservoirs around the world.

Keywords: overview of porosity classification, reservoir characterization, microporosity, carbonate reservoir

Procedia PDF Downloads 120
1826 Using Time Series NDVI to Model Land Cover Change: A Case Study in the Berg River Catchment Area, Western Cape, South Africa

Authors: Adesuyi Ayodeji Steve, Zahn Munch

Abstract:

This study investigates the use of MODIS NDVI to identify agricultural land cover change areas on an annual time step (2007 - 2012) and characterize the trend in the study area. An ISODATA classification was performed on the MODIS imagery to select only the agricultural class producing 3 class groups namely: agriculture, agriculture/semi-natural, and semi-natural. NDVI signatures were created for the time series to identify areas dominated by cereals and vineyards with the aid of ancillary, pictometry and field sample data. The NDVI signature curve and training samples aided in creating a decision tree model in WEKA 3.6.9. From the training samples two classification models were built in WEKA using decision tree classifier (J48) algorithm; Model 1 included ISODATA classification and Model 2 without, both having accuracies of 90.7% and 88.3% respectively. The two models were used to classify the whole study area, thus producing two land cover maps with Model 1 and 2 having classification accuracies of 77% and 80% respectively. Model 2 was used to create change detection maps for all the other years. Subtle changes and areas of consistency (unchanged) were observed in the agricultural classes and crop practices over the years as predicted by the land cover classification. 41% of the catchment comprises of cereals with 35% possibly following a crop rotation system. Vineyard largely remained constant over the years, with some conversion to vineyard (1%) from other land cover classes. Some of the changes might be as a result of misclassification and crop rotation system.

Keywords: change detection, land cover, modis, NDVI

Procedia PDF Downloads 372
1825 Ontology-Based Backpropagation Neural Network Classification and Reasoning Strategy for NoSQL and SQL Databases

Authors: Hao-Hsiang Ku, Ching-Ho Chi

Abstract:

Big data applications have become an imperative for many fields. Many researchers have been devoted into increasing correct rates and reducing time complexities. Hence, the study designs and proposes an Ontology-based backpropagation neural network classification and reasoning strategy for NoSQL big data applications, which is called ON4NoSQL. ON4NoSQL is responsible for enhancing the performances of classifications in NoSQL and SQL databases to build up mass behavior models. Mass behavior models are made by MapReduce techniques and Hadoop distributed file system based on Hadoop service platform. The reference engine of ON4NoSQL is the ontology-based backpropagation neural network classification and reasoning strategy. Simulation results indicate that ON4NoSQL can efficiently achieve to construct a high performance environment for data storing, searching, and retrieving.

Keywords: Hadoop, NoSQL, ontology, back propagation neural network, high distributed file system

Procedia PDF Downloads 234
1824 Synthesis and Characterization of Amino-Functionalized Polystyrene Nanoparticles as Reactive Filler

Authors: Yaseen Elhebshi, Abdulkareem Hamid, Nureddin Bin Issa, Xiaonong Chen

Abstract:

A convenient method of preparing ultrafine polystyrene latex nano-particles with amino groups on the surface is developed. Polystyrene latexes in the size range 50–400 nm were prepared via emulsion polymerization, using sodium dodecyl sulfate (SDS) as surfactant. Polystyrene with amino groups on the surface will be fine to use as organic filler to modify rubber. Transmission electron microscopy (TEM) was used to observe the morphology of silicon dioxide and functionalized polystyrene nano-particles. The nature of bonding between the polymer and the reactive groups on the filler surfaces was analyzed using Fourier transform infrared spectroscopy (FTIR). Scanning electron microscopy (SEM) was employed to examine the filler surface.

Keywords: reactive filler, emulsion polymerization, particle size, polystyrene nanoparticles

Procedia PDF Downloads 327
1823 Land Use Change Detection Using Satellite Images for Najran City, Kingdom of Saudi Arabia (KSA)

Authors: Ismail Elkhrachy

Abstract:

Determination of land use changing is an important component of regional planning for applications ranging from urban fringe change detection to monitoring change detection of land use. This data are very useful for natural resources management.On the other hand, the technologies and methods of change detection also have evolved dramatically during past 20 years. So it has been well recognized that the change detection had become the best methods for researching dynamic change of land use by multi-temporal remotely-sensed data. The objective of this paper is to assess, evaluate and monitor land use change surrounding the area of Najran city, Kingdom of Saudi Arabia (KSA) using Landsat images (June 23, 2009) and ETM+ image(June. 21, 2014). The post-classification change detection technique was applied. At last,two-time subset images of Najran city are compared on a pixel-by-pixel basis using the post-classification comparison method and the from-to change matrix is produced, the land use change information obtained.Three classes were obtained, urban, bare land and agricultural land from unsupervised classification method by using Erdas Imagine and ArcGIS software. Accuracy assessment of classification has been performed before calculating change detection for study area. The obtained accuracy is between 61% to 87% percent for all the classes. Change detection analysis shows that rapid growth in urban area has been increased by 73.2%, the agricultural area has been decreased by 10.5 % and barren area reduced by 7% between 2009 and 2014. The quantitative study indicated that the area of urban class has unchanged by 58.2 km〗^2, gained 70.3 〖km〗^2 and lost 16 〖km〗^2. For bare land class 586.4〖km〗^2 has unchanged, 53.2〖km〗^2 has gained and 101.5〖km〗^2 has lost. While agriculture area class, 20.2〖km〗^2 has unchanged, 31.2〖km〗^2 has gained and 37.2〖km〗^2 has lost.

Keywords: land use, remote sensing, change detection, satellite images, image classification

Procedia PDF Downloads 500
1822 The Necessity to Standardize Procedures of Providing Engineering Geological Data for Designing Road and Railway Tunneling Projects

Authors: Atefeh Saljooghi Khoshkar, Jafar Hassanpour

Abstract:

One of the main problems of the design stage relating to many tunneling projects is the lack of an appropriate standard for the provision of engineering geological data in a predefined format. In particular, this is more reflected in highway and railroad tunnel projects in which there is a number of tunnels and different professional teams involved. In this regard, comprehensive software needs to be designed using the accepted methods in order to help engineering geologists to prepare standard reports, which contain sufficient input data for the design stage. Regarding this necessity, applied software has been designed using macro capabilities and Visual Basic programming language (VBA) through Microsoft Excel. In this software, all of the engineering geological input data, which are required for designing different parts of tunnels, such as discontinuities properties, rock mass strength parameters, rock mass classification systems, boreability classification, the penetration rate, and so forth, can be calculated and reported in a standard format.

Keywords: engineering geology, rock mass classification, rock mechanic, tunnel

Procedia PDF Downloads 45
1821 Defect Classification of Hydrogen Fuel Pressure Vessels using Deep Learning

Authors: Dongju Kim, Youngjoo Suh, Hyojin Kim, Gyeongyeong Kim

Abstract:

Acoustic Emission Testing (AET) is widely used to test the structural integrity of an operational hydrogen storage container, and clustering algorithms are frequently used in pattern recognition methods to interpret AET results. However, the interpretation of AET results can vary from user to user as the tuning of the relevant parameters relies on the user's experience and knowledge of AET. Therefore, it is necessary to use a deep learning model to identify patterns in acoustic emission (AE) signal data that can be used to classify defects instead. In this paper, a deep learning-based model for classifying the types of defects in hydrogen storage tanks, using AE sensor waveforms, is proposed. As hydrogen storage tanks are commonly constructed using carbon fiber reinforced polymer composite (CFRP), a defect classification dataset is collected through a tensile test on a specimen of CFRP with an AE sensor attached. The performance of the classification model, using one-dimensional convolutional neural network (1-D CNN) and synthetic minority oversampling technique (SMOTE) data augmentation, achieved 91.09% accuracy for each defect. It is expected that the deep learning classification model in this paper, used with AET, will help in evaluating the operational safety of hydrogen storage containers.

Keywords: acoustic emission testing, carbon fiber reinforced polymer composite, one-dimensional convolutional neural network, smote data augmentation

Procedia PDF Downloads 64
1820 Classification of Manufacturing Data for Efficient Processing on an Edge-Cloud Network

Authors: Onyedikachi Ulelu, Andrew P. Longstaff, Simon Fletcher, Simon Parkinson

Abstract:

The widespread interest in 'Industry 4.0' or 'digital manufacturing' has led to significant research requiring the acquisition of data from sensors, instruments, and machine signals. In-depth research then identifies methods of analysis of the massive amounts of data generated before and during manufacture to solve a particular problem. The ultimate goal is for industrial Internet of Things (IIoT) data to be processed automatically to assist with either visualisation or autonomous system decision-making. However, the collection and processing of data in an industrial environment come with a cost. Little research has been undertaken on how to specify optimally what data to capture, transmit, process, and store at various levels of an edge-cloud network. The first step in this specification is to categorise IIoT data for efficient and effective use. This paper proposes the required attributes and classification to take manufacturing digital data from various sources to determine the most suitable location for data processing on the edge-cloud network. The proposed classification framework will minimise overhead in terms of network bandwidth/cost and processing time of machine tool data via efficient decision making on which dataset should be processed at the ‘edge’ and what to send to a remote server (cloud). A fast-and-frugal heuristic method is implemented for this decision-making. The framework is tested using case studies from industrial machine tools for machine productivity and maintenance.

Keywords: data classification, decision making, edge computing, industrial IoT, industry 4.0

Procedia PDF Downloads 149
1819 A Statistical Approach to Predict and Classify the Commercial Hatchability of Chickens Using Extrinsic Parameters of Breeders and Eggs

Authors: M. S. Wickramarachchi, L. S. Nawarathna, C. M. B. Dematawewa

Abstract:

Hatchery performance is critical for the profitability of poultry breeder operations. Some extrinsic parameters of eggs and breeders cause to increase or decrease the hatchability. This study aims to identify the affecting extrinsic parameters on the commercial hatchability of local chicken's eggs and determine the most efficient classification model with a hatchability rate greater than 90%. In this study, seven extrinsic parameters were considered: egg weight, moisture loss, breeders age, number of fertilised eggs, shell width, shell length, and shell thickness. Multiple linear regression was performed to determine the most influencing variable on hatchability. First, the correlation between each parameter and hatchability were checked. Then a multiple regression model was developed, and the accuracy of the fitted model was evaluated. Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), k-Nearest Neighbors (kNN), Support Vector Machines (SVM) with a linear kernel, and Random Forest (RF) algorithms were applied to classify the hatchability. This grouping process was conducted using binary classification techniques. Hatchability was negatively correlated with egg weight, breeders' age, shell width, shell length, and positive correlations were identified with moisture loss, number of fertilised eggs, and shell thickness. Multiple linear regression models were more accurate than single linear models regarding the highest coefficient of determination (R²) with 94% and minimum AIC and BIC values. According to the classification results, RF, CART, and kNN had performed the highest accuracy values 0.99, 0.975, and 0.972, respectively, for the commercial hatchery process. Therefore, the RF is the most appropriate machine learning algorithm for classifying the breeder outcomes, which are economically profitable or not, in a commercial hatchery.

Keywords: classification models, egg weight, fertilised eggs, multiple linear regression

Procedia PDF Downloads 58
1818 Local Directional Encoded Derivative Binary Pattern Based Coral Image Classification Using Weighted Distance Gray Wolf Optimization Algorithm

Authors: Annalakshmi G., Sakthivel Murugan S.

Abstract:

This paper presents a local directional encoded derivative binary pattern (LDEDBP) feature extraction method that can be applied for the classification of submarine coral reef images. The classification of coral reef images using texture features is difficult due to the dissimilarities in class samples. In coral reef image classification, texture features are extracted using the proposed method called local directional encoded derivative binary pattern (LDEDBP). The proposed approach extracts the complete structural arrangement of the local region using local binary batten (LBP) and also extracts the edge information using local directional pattern (LDP) from the edge response available in a particular region, thereby achieving extra discriminative feature value. Typically the LDP extracts the edge details in all eight directions. The process of integrating edge responses along with the local binary pattern achieves a more robust texture descriptor than the other descriptors used in texture feature extraction methods. Finally, the proposed technique is applied to an extreme learning machine (ELM) method with a meta-heuristic algorithm known as weighted distance grey wolf optimizer (GWO) to optimize the input weight and biases of single-hidden-layer feed-forward neural networks (SLFN). In the empirical results, ELM-WDGWO demonstrated their better performance in terms of accuracy on all coral datasets, namely RSMAS, EILAT, EILAT2, and MLC, compared with other state-of-the-art algorithms. The proposed method achieves the highest overall classification accuracy of 94% compared to the other state of art methods.

Keywords: feature extraction, local directional pattern, ELM classifier, GWO optimization

Procedia PDF Downloads 137
1817 Kannada HandWritten Character Recognition by Edge Hinge and Edge Distribution Techniques Using Manhatan and Minimum Distance Classifiers

Authors: C. V. Aravinda, H. N. Prakash

Abstract:

In this paper, we tried to convey fusion and state of art pertaining to SIL character recognition systems. In the first step, the text is preprocessed and normalized to perform the text identification correctly. The second step involves extracting relevant and informative features. The third step implements the classification decision. The three stages which involved are Data acquisition and preprocessing, Feature extraction, and Classification. Here we concentrated on two techniques to obtain features, Feature Extraction & Feature Selection. Edge-hinge distribution is a feature that characterizes the changes in direction of a script stroke in handwritten text. The edge-hinge distribution is extracted by means of a windowpane that is slid over an edge-detected binary handwriting image. Whenever the mid pixel of the window is on, the two edge fragments (i.e. connected sequences of pixels) emerging from this mid pixel are measured. Their directions are measured and stored as pairs. A joint probability distribution is obtained from a large sample of such pairs. Despite continuous effort, handwriting identification remains a challenging issue, due to different approaches use different varieties of features, having different. Therefore, our study will focus on handwriting recognition based on feature selection to simplify features extracting task, optimize classification system complexity, reduce running time and improve the classification accuracy.

Keywords: word segmentation and recognition, character recognition, optical character recognition, hand written character recognition, South Indian languages

Procedia PDF Downloads 470
1816 Music Genre Classification Based on Non-Negative Matrix Factorization Features

Authors: Soyon Kim, Edward Kim

Abstract:

In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers is being provided as statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured based on the timbre features such as mel-frequency cepstral coefficient (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) the statistical features such as mean, variance, minimum, and maximum of the timbre features and (2) the modulation spectrum features such as spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. Not only these conventional basic long-term feature vectors, but also NMF based feature vectors are proposed to be used together for genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, an entire full band spectrum was used. However, for NMF-BFV, only low band spectrum was used since high frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification. In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as a classifier. The GTZAN multi-genre music database was used for training and testing. It is composed of 10 genres and 100 songs for each genre. To increase the reliability of the experiments, 10-fold cross validation was used. For a given input song, an extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for 10 genres. An NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features such as statistical features and modulation spectrum features, the NMF features provided the increased accuracy with a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, but the basic features with NMF-LSM and NMF-BFV provided 85.1% and 84.2% accuracy, respectively. The basic features required dimensionality of 460, but NMF-LSM and NMF-BFV required dimensionalities of 10 and 10, respectively. Combining the basic features, NMF-LSM and NMF-BFV together with the SVM with a radial basis function (RBF) kernel produced the significantly higher classification accuracy of 88.3% with a feature dimensionality of 480.

Keywords: mel-frequency cepstral coefficient (MFCC), music genre classification, non-negative matrix factorization (NMF), support vector machine (SVM)

Procedia PDF Downloads 265
1815 Decision Making System for Clinical Datasets

Authors: P. Bharathiraja

Abstract:

Computer Aided decision making system is used to enhance diagnosis and prognosis of diseases and also to assist clinicians and junior doctors in clinical decision making. Medical Data used for decision making should be definite and consistent. Data Mining and soft computing techniques are used for cleaning the data and for incorporating human reasoning in decision making systems. Fuzzy rule based inference technique can be used for classification in order to incorporate human reasoning in the decision making process. In this work, missing values are imputed using the mean or mode of the attribute. The data are normalized using min-ma normalization to improve the design and efficiency of the fuzzy inference system. The fuzzy inference system is used to handle the uncertainties that exist in the medical data. Equal-width-partitioning is used to partition the attribute values into appropriate fuzzy intervals. Fuzzy rules are generated using Class Based Associative rule mining algorithm. The system is trained and tested using heart disease data set from the University of California at Irvine (UCI) Machine Learning Repository. The data was split using a hold out approach into training and testing data. From the experimental results it can be inferred that classification using fuzzy inference system performs better than trivial IF-THEN rule based classification approaches. Furthermore it is observed that the use of fuzzy logic and fuzzy inference mechanism handles uncertainty and also resembles human decision making. The system can be used in the absence of a clinical expert to assist junior doctors and clinicians in clinical decision making.

Keywords: decision making, data mining, normalization, fuzzy rule, classification

Procedia PDF Downloads 486
1814 Dual-Channel Reliable Breast Ultrasound Image Classification Based on Explainable Attribution and Uncertainty Quantification

Authors: Haonan Hu, Shuge Lei, Dasheng Sun, Huabin Zhang, Kehong Yuan, Jian Dai, Jijun Tang

Abstract:

This paper focuses on the classification task of breast ultrasound images and conducts research on the reliability measurement of classification results. A dual-channel evaluation framework was developed based on the proposed inference reliability and predictive reliability scores. For the inference reliability evaluation, human-aligned and doctor-agreed inference rationals based on the improved feature attribution algorithm SP-RISA are gracefully applied. Uncertainty quantification is used to evaluate the predictive reliability via the test time enhancement. The effectiveness of this reliability evaluation framework has been verified on the breast ultrasound clinical dataset YBUS, and its robustness is verified on the public dataset BUSI. The expected calibration errors on both datasets are significantly lower than traditional evaluation methods, which proves the effectiveness of the proposed reliability measurement.

Keywords: medical imaging, ultrasound imaging, XAI, uncertainty measurement, trustworthy AI

Procedia PDF Downloads 57
1813 A Multi-Output Network with U-Net Enhanced Class Activation Map and Robust Classification Performance for Medical Imaging Analysis

Authors: Jaiden Xuan Schraut, Leon Liu, Yiqiao Yin

Abstract:

Computer vision in medical diagnosis has achieved a high level of success in diagnosing diseases with high accuracy. However, conventional classifiers that produce an image to-label result provides insufficient information for medical professionals to judge and raise concerns over the trust and reliability of a model with results that cannot be explained. In order to gain local insight into cancerous regions, separate tasks such as imaging segmentation need to be implemented to aid the doctors in treating patients, which doubles the training time and costs which renders the diagnosis system inefficient and difficult to be accepted by the public. To tackle this issue and drive AI-first medical solutions further, this paper proposes a multi-output network that follows a U-Net architecture for image segmentation output and features an additional convolutional neural networks (CNN) module for auxiliary classification output. Class activation maps are a method of providing insight into a convolutional neural network’s feature maps that leads to its classification but in the case of lung diseases, the region of interest is enhanced by U-net-assisted Class Activation Map (CAM) visualization. Therefore, our proposed model combines image segmentation models and classifiers to crop out only the lung region of a chest X-ray’s class activation map to provide a visualization that improves the explainability and is able to generate classification results simultaneously which builds trust for AI-led diagnosis systems. The proposed U-Net model achieves 97.61% accuracy and a dice coefficient of 0.97 on testing data from the COVID-QU-Ex Dataset which includes both diseased and healthy lungs.

Keywords: multi-output network model, U-net, class activation map, image classification, medical imaging analysis

Procedia PDF Downloads 161
1812 Mechanisms Leading to the Protective Behavior of Ethanol Vapour Drying of Probiotics

Authors: Shahnaz Mansouri, Xiao Dong Chen, Meng Wai Woo

Abstract:

A new antisolvent vapour precipitation approach was used to make ultrafine submicron probiotic encapsulates. The approach uses ethanol vapour to precipitate submicron encapsulates within relatively large droplets. Surprisingly, the probiotics (Lactobacillus delbrueckii ssp. bulgaricus, Streptococcus thermophilus) showed relatively high survival even under destructive ethanolic conditions within the droplet. This unusual behaviour was deduced to be caused by the denaturation and aggregation of the milk protein forming an ethanolic protective matrix for the probiotics. Skim milk droplets which is rich in casein and contains naturally occurring minerals provided higher ethanolic protection when compared whey protein isolate and lactose droplets.

Keywords: whey, skim milk, probiotic, antisolvent, precipitation, encapsulation, denaturation, aggregation

Procedia PDF Downloads 493
1811 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on $k$-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 133
1810 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0 %, 80.5 %, 80.5 %, 63.6 %, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 122
1809 Classification of EEG Signals Based on Dynamic Connectivity Analysis

Authors: Zoran Šverko, Saša Vlahinić, Nino Stojković, Ivan Markovinović

Abstract:

In this article, the classification of target letters is performed using data from the EEG P300 Speller paradigm. Neural networks trained with the results of dynamic connectivity analysis between different brain regions are used for classification. Dynamic connectivity analysis is based on the adaptive window size and the imaginary part of the complex Pearson correlation coefficient. Brain dynamics are analysed using the relative intersection of confidence intervals for the imaginary component of the complex Pearson correlation coefficient method (RICI-imCPCC). The RICI-imCPCC method overcomes the shortcomings of currently used dynamical connectivity analysis methods, such as the low reliability and low temporal precision for short connectivity intervals encountered in constant sliding window analysis with wide window size and the high susceptibility to noise encountered in constant sliding window analysis with narrow window size. This method overcomes these shortcomings by dynamically adjusting the window size using the RICI rule. This method extracts information about brain connections for each time sample. Seventy percent of the extracted brain connectivity information is used for training and thirty percent for validation. Classification of the target word is also done and based on the same analysis method. As far as we know, through this research, we have shown for the first time that dynamic connectivity can be used as a parameter for classifying EEG signals.

Keywords: dynamic connectivity analysis, EEG, neural networks, Pearson correlation coefficients

Procedia PDF Downloads 170
1808 The Impact on the Composition of Survey Refusals΄ Demographic Profile When Implementing Different Classifications

Authors: Eva Tsouparopoulou, Maria Symeonaki

Abstract:

The internationally documented declining survey response rates of the last two decades are mainly attributed to refusals. In fieldwork, a refusal may be obtained not only from the respondent himself/herself, but from other sources on the respondent’s behalf, such as other household members, apartment building residents or administrator(s), and neighborhood residents. In this paper, we investigate how the composition of the demographic profile of survey refusals changes when different classifications are implemented and the classification issues arising from that. The analysis is based on the 2002-2018 European Social Survey (ESS) datasets for Belgium, Germany, and United Kingdom. For these three countries, the size of selected sample units coded as a type of refusal for all nine under investigation rounds was large enough to meet the purposes of the analysis. The results indicate the existence of four different possible classifications that can be implemented and the significance of choosing the one that strengthens the contrasts of the different types of respondents' demographic profiles. Since the foundation of social quantitative research lies in the triptych of definition, classification, and measurement, this study aims to identify the multiplicity of the definition of survey refusals as a methodological tool for the continually growing research on non-response.

Keywords: non-response, refusals, European social survey, classification

Procedia PDF Downloads 59
1807 Disease Level Assessment in Wheat Plots Using a Residual Deep Learning Algorithm

Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell

Abstract:

The assessment of disease levels in crop fields is an important and time-consuming task that generally relies on expert knowledge of trained individuals. Image classification in agriculture problems historically has been based on classical machine learning strategies that make use of hand-engineered features in the top of a classification algorithm. This approach tends to not produce results with high accuracy and generalization to the classes classified by the system when the nature of the elements has a significant variability. The advent of deep convolutional neural networks has revolutionized the field of machine learning, especially in computer vision tasks. These networks have great resourcefulness of learning and have been applied successfully to image classification and object detection tasks in the last years. The objective of this work was to propose a new method based on deep learning convolutional neural networks towards the task of disease level monitoring. Common RGB images of winter wheat were obtained during a growing season. Five categories of disease levels presence were produced, in collaboration with agronomists, for the algorithm classification. Disease level tasks performed by experts provided ground truth data for the disease score of the same winter wheat plots were RGB images were acquired. The system had an overall accuracy of 84% on the discrimination of the disease level classes.

Keywords: crop disease assessment, deep learning, precision agriculture, residual neural networks

Procedia PDF Downloads 296
1806 Population Dynamics and Land Use/Land Cover Change on the Chilalo-Galama Mountain Range, Ethiopia

Authors: Yusuf Jundi Sado

Abstract:

Changes in land use are mostly credited to human actions that result in negative impacts on biodiversity and ecosystem functions. This study aims to analyze the dynamics of land use and land cover changes for sustainable natural resources planning and management. Chilalo-Galama Mountain Range, Ethiopia. This study used Thematic Mapper 05 (TM) for 1986, 2001 and Landsat 8 (OLI) data 2017. Additionally, data from the Central Statistics Agency on human population growth were analyzed. Semi-Automatic classification plugin (SCP) in QGIS 3.2.3 software was used for image classification. Global positioning system, field observations and focus group discussions were used for ground verification. Land Use Land Cover (LU/LC) change analysis was using maximum likelihood supervised classification and changes were calculated for the 1986–2001 and the 2001–2017 and 1986-2017 periods. The results show that agricultural land increased from 27.85% (1986) to 44.43% and 51.32% in 2001 and 2017, respectively with the overall accuracies of 92% (1986), 90.36% (2001), and 88% (2017). On the other hand, forests decreased from 8.51% (1986) to 7.64 (2001) and 4.46% (2017), and grassland decreased from 37.47% (1986) to 15.22%, and 15.01% in 2001 and 2017, respectively. It indicates for the years 1986–2017 the largest area cover gain of agricultural land was obtained from grassland. The matrix also shows that shrubland gained land from agricultural land, afro-alpine, and forest land. Population dynamics is found to be one of the major driving forces for the LU/LU changes in the study area.

Keywords: Landsat, LU/LC change, Semi-Automatic classification plugin, population dynamics, Ethiopia

Procedia PDF Downloads 52
1805 Clinical Feature Analysis and Prediction on Recurrence in Cervical Cancer

Authors: Ravinder Bahl, Jamini Sharma

Abstract:

The paper demonstrates analysis of the cervical cancer based on a probabilistic model. It involves technique for classification and prediction by recognizing typical and diagnostically most important test features relating to cervical cancer. The main contributions of the research include predicting the probability of recurrences in no recurrence (first time detection) cases. The combination of the conventional statistical and machine learning tools is applied for the analysis. Experimental study with real data demonstrates the feasibility and potential of the proposed approach for the said cause.

Keywords: cervical cancer, recurrence, no recurrence, probabilistic, classification, prediction, machine learning

Procedia PDF Downloads 327
1804 A Machine Learning Approach for Classification of Directional Valve Leakage in the Hydraulic Final Test

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Due to increasing cost pressure in global markets, artificial intelligence is becoming a technology that is decisive for competition. Predictive quality enables machinery and plant manufacturers to ensure product quality by using data-driven forecasts via machine learning models as a decision-making basis for test results. The use of cross-process Bosch production data along the value chain of hydraulic valves is a promising approach to classifying the quality characteristics of workpieces.

Keywords: predictive quality, hydraulics, machine learning, classification, supervised learning

Procedia PDF Downloads 203
1803 Time-Frequency Feature Extraction Method Based on Micro-Doppler Signature of Ground Moving Targets

Authors: Ke Ren, Huiruo Shi, Linsen Li, Baoshuai Wang, Yu Zhou

Abstract:

Since some discriminative features are required for ground moving targets classification, we propose a new feature extraction method based on micro-Doppler signature. Firstly, the time-frequency analysis of measured data indicates that the time-frequency spectrograms of the three kinds of ground moving targets, i.e., single walking person, two people walking and a moving wheeled vehicle, are discriminative. Then, a three-dimensional time-frequency feature vector is extracted from the time-frequency spectrograms to depict these differences. At last, a Support Vector Machine (SVM) classifier is trained with the proposed three-dimensional feature vector. The classification accuracy to categorize ground moving targets into the three kinds of the measured data is found to be over 96%, which demonstrates the good discriminative ability of the proposed micro-Doppler feature.

Keywords: micro-doppler, time-frequency analysis, feature extraction, radar target classification

Procedia PDF Downloads 378
1802 Clustering the Wheat Seeds Using SOM Artificial Neural Networks

Authors: Salah Ghamari

Abstract:

In this study, the ability of self organizing map artificial (SOM) neural networks in clustering the wheat seeds varieties according to morphological properties of them was considered. The SOM is one type of unsupervised competitive learning. Experimentally, five morphological features of 300 seeds (including three varieties: gaskozhen, Md and sardari) were obtained using image processing technique. The results show that the artificial neural network has a good performance (90.33% accuracy) in classification of the wheat varieties despite of high similarity in them. The highest classification accuracy (100%) was achieved for sardari.

Keywords: artificial neural networks, clustering, self organizing map, wheat variety

Procedia PDF Downloads 610
1801 SEM Image Classification Using CNN Architectures

Authors: Güzi̇n Ti̇rkeş, Özge Teki̇n, Kerem Kurtuluş, Y. Yekta Yurtseven, Murat Baran

Abstract:

A scanning electron microscope (SEM) is a type of electron microscope mainly used in nanoscience and nanotechnology areas. Automatic image recognition and classification are among the general areas of application concerning SEM. In line with these usages, the present paper proposes a deep learning algorithm that classifies SEM images into nine categories by means of an online application to simplify the process. The NFFA-EUROPE - 100% SEM data set, containing approximately 21,000 images, was used to train and test the algorithm at 80% and 20%, respectively. Validation was carried out using a separate data set obtained from the Middle East Technical University (METU) in Turkey. To increase the accuracy in the results, the Inception ResNet-V2 model was used in view of the Fine-Tuning approach. By using a confusion matrix, it was observed that the coated-surface category has a negative effect on the accuracy of the results since it contains other categories in the data set, thereby confusing the model when detecting category-specific patterns. For this reason, the coated-surface category was removed from the train data set, hence increasing accuracy by up to 96.5%.

Keywords: convolutional neural networks, deep learning, image classification, scanning electron microscope

Procedia PDF Downloads 84
1800 Mixed Integer Programming-Based One-Class Classification Method for Process Monitoring

Authors: Younghoon Kim, Seoung Bum Kim

Abstract:

One-class classification plays an important role in detecting outlier and abnormality from normal observations. In the previous research, several attempts were made to extend the scope of application of the one-class classification techniques to statistical process control problems. For most previous approaches, such as support vector data description (SVDD) control chart, the design of the control limits is commonly based on the assumption that the proportion of abnormal observations is approximately equal to an expected Type I error rate in Phase I process. Because of the limitation of the one-class classification techniques based on convex optimization, we cannot make the proportion of abnormal observations exactly equal to expected Type I error rate: controlling Type I error rate requires to optimize constraints with integer decision variables, but convex optimization cannot satisfy the requirement. This limitation would be undesirable in theoretical and practical perspective to construct effective control charts. In this work, to address the limitation of previous approaches, we propose the one-class classification algorithm based on the mixed integer programming technique, which can solve problems formulated with continuous and integer decision variables. The proposed method minimizes the radius of a spherically shaped boundary subject to the number of normal data to be equal to a constant value specified by users. By modifying this constant value, users can exactly control the proportion of normal data described by the spherically shaped boundary. Thus, the proportion of abnormal observations can be made theoretically equal to an expected Type I error rate in Phase I process. Moreover, analogous to SVDD, the boundary can be made to describe complex structures by using some kernel functions. New multivariate control chart applying the effectiveness of the algorithm is proposed. This chart uses a monitoring statistic to characterize the degree of being an abnormal point as obtained through the proposed one-class classification. The control limit of the proposed chart is established by the radius of the boundary. The usefulness of the proposed method was demonstrated through experiments with simulated and real process data from a thin film transistor-liquid crystal display.

Keywords: control chart, mixed integer programming, one-class classification, support vector data description

Procedia PDF Downloads 150