Search results for: kernel k-medoids
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 242

Search results for: kernel k-medoids

152 A Study on the Performance of 2-PC-D Classification Model

Authors: Nurul Aini Abdul Wahab, Nor Syamim Halidin, Sayidatina Aisah Masnan, Nur Izzati Romli

Abstract:

There are many applications of principle component method for reducing the large set of variables in various fields. Fisher’s Discriminant function is also a popular tool for classification. In this research, the researcher focuses on studying the performance of Principle Component-Fisher’s Discriminant function in helping to classify rice kernels to their defined classes. The data were collected on the smells or odour of the rice kernel using odour-detection sensor, Cyranose. 32 variables were captured by this electronic nose (e-nose). The objective of this research is to measure how well a combination model, between principle component and linear discriminant, to be as a classification model. Principle component method was used to reduce all 32 variables to a smaller and manageable set of components. Then, the reduced components were used to develop the Fisher’s Discriminant function. In this research, there are 4 defined classes of rice kernel which are Aromatic, Brown, Ordinary and Others. Based on the output from principle component method, the 32 variables were reduced to only 2 components. Based on the output of classification table from the discriminant analysis, 40.76% from the total observations were correctly classified into their classes by the PC-Discriminant function. Indirectly, it gives an idea that the classification model developed has committed to more than 50% of misclassifying the observations. As a conclusion, the Fisher’s Discriminant function that was built on a 2-component from PCA (2-PC-D) is not satisfying to classify the rice kernels into its defined classes.

Keywords: classification model, discriminant function, principle component analysis, variable reduction

Procedia PDF Downloads 305
151 TACTICAL: Ram Image Retrieval in Linux Using Protected Mode Architecture’s Paging Technique

Authors: Sedat Aktas, Egemen Ulusoy, Remzi Yildirim

Abstract:

This article explains how to get a ram image from a computer with a Linux operating system and what steps should be followed while getting it. What we mean by taking a ram image is the process of dumping the physical memory instantly and writing it to a file. This process can be likened to taking a picture of everything in the computer’s memory at that moment. This process is very important for tools that analyze ram images. Volatility can be given as an example because before these tools can analyze ram, images must be taken. These tools are used extensively in the forensic world. Forensic, on the other hand, is a set of processes for digitally examining the information on any computer or server on behalf of official authorities. In this article, the protected mode architecture in the Linux operating system is examined, and the way to save the image sample of the kernel driver and system memory to disk is followed. Tables and access methods to be used in the operating system are examined based on the basic architecture of the operating system, and the most appropriate methods and application methods are transferred to the article. Since there is no article directly related to this study on Linux in the literature, it is aimed to contribute to the literature with this study on obtaining ram images. LIME can be mentioned as a similar tool, but there is no explanation about the memory dumping method of this tool. Considering the frequency of use of these tools, the contribution of the study in the field of forensic medicine has been the main motivation of the study due to the intense studies on ram image in the field of forensics.

Keywords: linux, paging, addressing, ram-image, memory dumping, kernel modules, forensic

Procedia PDF Downloads 77
150 An Assessment of Health Hazards in Urban Communities: A Study of Spatial-Temporal Variations of Dengue Epidemic in Colombo, Sri Lanka

Authors: U. Thisara G. Perera, C. M. Kanchana N. K. Chandrasekara

Abstract:

Dengue is an epidemic which is spread by Aedes Egyptai and Aedes Albopictus mosquitoes. The cases of dengue show a dramatic growth rate of the epidemic in urban and semi urban areas spatially in tropical and sub-tropical regions of the world. Incidence of dengue has become a prominent reason for hospitalization and deaths in Asian countries, including Sri Lanka. During the last decade the dengue epidemic began to spread from urban to semi-urban and then to rural settings of the country. The highest number of dengue infected patients was recorded in Sri Lanka in the year 2016 and the highest number of patients was identified in Colombo district. Together with the commercial, industrial, and other supporting services, the district suffers from rapid urbanization and high population density. Thus, drainage and waste disposal patterns of the people in this area exert an additional pressure to the environment. The district is situated in the wet zone and thus low lying lands constitute the largest portion of the district. This situation additionally facilitates mosquito breeding sites. Therefore, the purpose of the present study was to assess the spatial and temporal distribution patterns of dengue epidemic in Kolonnawa MOH area (Medical Officer of Health) in the district of Colombo. The study was carried out using 615 recorded dengue cases in Kollonnawa MOH area during the south east monsoon season from May to September 2016. The Moran’s I and Kernel density estimation were used as analytical methods. The analysis of data was accomplished through the integrated use of ArcGIS 10.1 software packages along with Microsoft Excel analytical tool. Field observation was also carried out for verification purposes during the study period. Results of the Moran’s I index indicates that the spatial distribution of dengue cases showed a cluster distribution pattern across the area. Kernel density estimation emphasis that dengue cases are high where the population has gathered, especially in areas comprising housing schemes. Results of the Kernel Density estimation further discloses that hot spots of dengue epidemic are located in the western half of the Kolonnawa MOH area, which is close to the Colombo municipal boundary and there is a significant relationship with high population density and unplanned urban land use practices. Results of the field observation confirm that the drainage systems in these areas function poorly and careless waste disposal methods of the people further encourage mosquito breeding sites. This situation has evolved harmfully from a public health issue to a social problem, which ultimately impacts on the economy and social lives of the country.

Keywords: Dengue epidemic, health hazards, Kernel density, Moran’s I, Sri Lanka

Procedia PDF Downloads 274
149 A Hierarchical Method for Multi-Class Probabilistic Classification Vector Machines

Authors: P. Byrnes, F. A. DiazDelaO

Abstract:

The Support Vector Machine (SVM) has become widely recognised as one of the leading algorithms in machine learning for both regression and binary classification. It expresses predictions in terms of a linear combination of kernel functions, referred to as support vectors. Despite its popularity amongst practitioners, SVM has some limitations, with the most significant being the generation of point prediction as opposed to predictive distributions. Stemming from this issue, a probabilistic model namely, Probabilistic Classification Vector Machines (PCVM), has been proposed which respects the original functional form of SVM whilst also providing a predictive distribution. As physical system designs become more complex, an increasing number of classification tasks involving industrial applications consist of more than two classes. Consequently, this research proposes a framework which allows for the extension of PCVM to a multi class setting. Additionally, the original PCVM framework relies on the use of type II maximum likelihood to provide estimates for both the kernel hyperparameters and model evidence. In a high dimensional multi class setting, however, this approach has been shown to be ineffective due to bad scaling as the number of classes increases. Accordingly, we propose the application of Markov Chain Monte Carlo (MCMC) based methods to provide a posterior distribution over both parameters and hyperparameters. The proposed framework will be validated against current multi class classifiers through synthetic and real life implementations.

Keywords: probabilistic classification vector machines, multi class classification, MCMC, support vector machines

Procedia PDF Downloads 201
148 Modeling Palm Oil Quality During the Ripening Process of Fresh Fruits

Authors: Afshin Keshvadi, Johari Endan, Haniff Harun, Desa Ahmad, Farah Saleena

Abstract:

Experiments were conducted to develop a model for analyzing the ripening process of oil palm fresh fruits in relation to oil yield and oil quality of palm oil produced. This research was carried out on 8-year-old Tenera (Dura × Pisifera) palms planted in 2003 at the Malaysian Palm Oil Board Research Station. Fresh fruit bunches were harvested from designated palms during January till May of 2010. The bunches were divided into three regions (top, middle and bottom), and fruits from the outer and inner layers were randomly sampled for analysis at 8, 12, 16 and 20 weeks after anthesis to establish relationships between maturity and oil development in the mesocarp and kernel. Computations on data related to ripening time, oil content and oil quality were performed using several computer software programs (MSTAT-C, SAS and Microsoft Excel). Nine nonlinear mathematical models were utilized using MATLAB software to fit the data collected. The results showed mean mesocarp oil percent increased from 1.24 % at 8 weeks after anthesis to 29.6 % at 20 weeks after anthesis. Fruits from the top part of the bunch had the highest mesocarp oil content of 10.09 %. The lowest kernel oil percent of 0.03 % was recorded at 12 weeks after anthesis. Palmitic acid and oleic acid comprised of more than 73 % of total mesocarp fatty acids at 8 weeks after anthesis, and increased to more than 80 % at fruit maturity at 20 weeks. The Logistic model with the highest R2 and the lowest root mean square error was found to be the best fit model.

Keywords: oil palm, oil yield, ripening process, anthesis, fatty acids, modeling

Procedia PDF Downloads 281
147 Physics-Informed Convolutional Neural Networks for Reservoir Simulation

Authors: Jiangxia Han, Liang Xue, Keda Chen

Abstract:

Despite the significant progress over the last decades in reservoir simulation using numerical discretization, meshing is complex. Moreover, the high degree of freedom of the space-time flow field makes the solution process very time-consuming. Therefore, we present Physics-Informed Convolutional Neural Networks(PICNN) as a hybrid scientific theory and data method for reservoir modeling. Besides labeled data, the model is driven by the scientific theories of the underlying problem, such as governing equations, boundary conditions, and initial conditions. PICNN integrates governing equations and boundary conditions into the network architecture in the form of a customized convolution kernel. The loss function is composed of data matching, initial conditions, and other measurable prior knowledge. By customizing the convolution kernel and minimizing the loss function, the neural network parameters not only fit the data but also honor the governing equation. The PICNN provides a methodology to model and history-match flow and transport problems in porous media. Numerical results demonstrate that the proposed PICNN can provide an accurate physical solution from a limited dataset. We show how this method can be applied in the context of a forward simulation for continuous problems. Furthermore, several complex scenarios are tested, including the existence of data noise, different work schedules, and different good patterns.

Keywords: convolutional neural networks, deep learning, flow and transport in porous media, physics-informed neural networks, reservoir simulation

Procedia PDF Downloads 102
146 Existence of Minimal and Maximal Mild Solutions for Non-Local in Time Subdiffusion Equations of Neutral Type

Authors: Jorge Gonzalez-Camus

Abstract:

In this work is proved the existence of at least one minimal and maximal mild solutions to the Cauchy problem, for fractional evolution equation of neutral type, involving a general kernel. An operator A generating a resolvent family and integral resolvent family on a Banach space X and a kernel belonging to a large class appears in the equation, which covers many relevant cases from physics applications, in particular, the important case of time - fractional evolution equations of neutral type. The main tool used in this work was the Kuratowski measure of noncompactness and fixed point theorems, specifically Darbo-type, and an iterative method of lower and upper solutions, based in an order in X induced by a normal cone P. Initially, the equation is a Cauchy problem, involving a fractional derivate in Caputo sense. Then, is formulated the equivalent integral version, and defining a convenient functional, using the theory of resolvent families, and verifying the hypothesis of the fixed point theorem of Darbo type, give us the existence of mild solution for the initial problem. Furthermore, the existence of minimal and maximal mild solutions was proved through in an iterative method of lower and upper solutions, using the Azcoli-Arzela Theorem, and the Gronwall’s inequality. Finally, we recovered the case derivate in Caputo sense.

Keywords: fractional evolution equations, Volterra integral equations, minimal and maximal mild solutions, neutral type equations, non-local in time equations

Procedia PDF Downloads 143
145 Combustion and Emissions Performance of Syngas Fuels Derived from Palm Kernel Shell and Polyethylene (PE) Waste via Catalytic Steam Gasification

Authors: Chaouki Ghenai

Abstract:

Computational fluid dynamics analysis of the burning of syngas fuels derived from biomass and plastic solid waste mixture through gasification process is presented in this paper. The syngas fuel is burned in gas turbine can combustor. Gas turbine can combustor with swirl is designed to burn the fuel efficiently and reduce the emissions. The main objective is to test the impact of the alternative syngas fuel compositions and lower heating value on the combustion performance and emissions. The syngas fuel is produced by blending Palm Kernel Shell (PKS) with Polyethylene (PE) waste via catalytic steam gasification (fluidized bed reactor). High hydrogen content syngas fuel was obtained by mixing 30% PE waste with PKS. The syngas composition obtained through the gasification process is 76.2% H2, 8.53% CO, 4.39% CO2 and 10.90% CH4. The lower heating value of the syngas fuel is LHV = 15.98 MJ/m3. Three fuels were tested in this study natural gas (100%CH4), syngas fuel and pure hydrogen (100% H2). The power from the combustor was kept constant for all the fuels tested in this study. The effect of syngas fuel composition and lower heating value on the flame shape, gas temperature, mass of carbon dioxide (CO2) and nitrogen oxides (NOX) per unit of energy generation is presented in this paper. The results show an increase of the peak flame temperature and NO mass fractions for the syngas and hydrogen fuels compared to natural gas fuel combustion. Lower average CO2 emissions at the exit of the combustor are obtained for the syngas compared to the natural gas fuel.

Keywords: CFD, combustion, emissions, gas turbine combustor, gasification, solid waste, syngas, waste to energy

Procedia PDF Downloads 565
144 Analysis of Factors Affecting the Number of Infant and Maternal Mortality in East Java with Geographically Weighted Bivariate Generalized Poisson Regression Method

Authors: Luh Eka Suryani, Purhadi

Abstract:

Poisson regression is a non-linear regression model with response variable in the form of count data that follows Poisson distribution. Modeling for a pair of count data that show high correlation can be analyzed by Poisson Bivariate Regression. Data, the number of infant mortality and maternal mortality, are count data that can be analyzed by Poisson Bivariate Regression. The Poisson regression assumption is an equidispersion where the mean and variance values are equal. However, the actual count data has a variance value which can be greater or less than the mean value (overdispersion and underdispersion). Violations of this assumption can be overcome by applying Generalized Poisson Regression. Characteristics of each regency can affect the number of cases occurred. This issue can be overcome by spatial analysis called geographically weighted regression. This study analyzes the number of infant mortality and maternal mortality based on conditions in East Java in 2016 using Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) method. Modeling is done with adaptive bisquare Kernel weighting which produces 3 regency groups based on infant mortality rate and 5 regency groups based on maternal mortality rate. Variables that significantly influence the number of infant and maternal mortality are the percentages of pregnant women visit health workers at least 4 times during pregnancy, pregnant women get Fe3 tablets, obstetric complication handled, clean household and healthy behavior, and married women with the first marriage age under 18 years.

Keywords: adaptive bisquare kernel, GWBGPR, infant mortality, maternal mortality, overdispersion

Procedia PDF Downloads 131
143 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing for avoiding the Hughes phenomena due to the difficulty for collecting training samples. Hence, lots of researches developed feature selection methods such as F-score, HSIC (Hilbert-Schmidt Independence Criterion), and etc., to improve hyperspectral image classification. However, most of them only consider the class separability in the original space, i.e., a linear class separability. In this study, we proposed a nonlinear class separability measure based on kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability was formed by a generalized RBF kernel with different bandwidths with respect to different features. Moreover, it considered the within-class separability and the between-class separability. A genetic algorithm was applied to tune these bandwidths such that the smallest with-class separability and the largest between-class separability simultaneously. This indicates the corresponding feature space is more suitable for classification. In addition, the corresponding nonlinear classification boundary can separate classes very well. These optimal bandwidths also show the importance of bands for hyperspectral image classification. The reciprocals of these bandwidths can be viewed as weights of bands. The smaller bandwidth, the larger weight of the band, and the more importance for classification. Hence, the descending order of the reciprocals of the bands gives an order for selecting the appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate the selected feature subsets by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of samples were randomly selected to form the training dataset. All non-background samples were used to form the testing dataset. The support vector machine was applied to classify these testing samples based on selected feature subsets. According to the experiments on the Indian Pine Site data set with 220 bands, the highest accuracies by applying the proposed method, F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively. However, the proposed method selects 158 features. F-score and HSIC select 168 features and 217 features, respectively. Moreover, the classification accuracies increase dramatically only using first few features. The classification accuracies with respect to feature subsets of 10 features, 20 features, 50 features, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, only using half selected features (110 features) of the proposed method, the corresponding classification accuracy (0.84168) is approximate to the highest classification accuracy, 0.8795. For other two hyperspectral image data sets, the PAVIA data set and Salinas A data set, we can obtain the similar results. These results illustrate our proposed method can efficiently find feature subsets to improve hyperspectral image classification. One can apply the proposed method to determine the suitable feature subset first according to specific purposes. Then researchers can only use the corresponding sensors to obtain the hyperspectral image and classify the samples. This can not only improve the classification performance but also reduce the cost for obtaining hyperspectral images.

Keywords: hyperspectral image classification, nonlinear feature selection, kernel trick, support vector machine

Procedia PDF Downloads 240
142 Rough Oscillatory Singular Integrals on Rⁿ

Authors: H. M. Al-Qassem, L. Cheng, Y. Pan

Abstract:

In this paper we establish sharp bounds for oscillatory singular integrals with an arbitrary real polynomial phase P. Our kernels are allowed to be rough both on the unit sphere and in the radial direction. We show that the bounds grow no faster than log(deg(P)), which is optimal and was first obtained by Parissis and Papadimitrakis for kernels without any radial roughness. Among key ingredients of our methods are an L¹→L² estimate and extrapolation.

Keywords: oscillatory singular integral, rough kernel, singular integral, Orlicz spaces, Block spaces, extrapolation, L^{p} boundedness

Procedia PDF Downloads 328
141 Fuglede-Putnam Theorem for ∗-Class A Operators

Authors: Mohammed Husein Mohammad Rashid

Abstract:

For a bounded linear operator T acting on a complex infinite dimensional Hilbert space ℋ, we say that T is ∗-class A operator (abbreviation T∈A*) if |T²|≥ |T*|². In this article, we prove the following assertions:(i) we establish some conditions which imply the normality of ∗-class A; (ii) we consider ∗-class A operator T ∈ ℬ(ℋ) with reducing kernel such that TX = XS for some X ∈ ℬ(K, ℋ) and prove the Fuglede-Putnam type theorem when adjoint of S ∈ ℬ(K) is dominant operators; (iii) furthermore, we extend the asymmetric Putnam-Fuglede theorem the class of ∗-class A operators.

Keywords: fuglede-putnam theorem, normal operators, ∗-class a operators, dominant operators

Procedia PDF Downloads 57
140 Lean Comic GAN (LC-GAN): a Light-Weight GAN Architecture Leveraging Factorized Convolution and Teacher Forcing Distillation Style Loss Aimed to Capture Two Dimensional Animated Filtered Still Shots Using Mobile Phone Camera and Edge Devices

Authors: Kaustav Mukherjee

Abstract:

In this paper we propose a Neural Style Transfer solution whereby we have created a Lightweight Separable Convolution Kernel Based GAN Architecture (SC-GAN) which will very useful for designing filter for Mobile Phone Cameras and also Edge Devices which will convert any image to its 2D ANIMATED COMIC STYLE Movies like HEMAN, SUPERMAN, JUNGLE-BOOK. This will help the 2D animation artist by relieving to create new characters from real life person's images without having to go for endless hours of manual labour drawing each and every pose of a cartoon. It can even be used to create scenes from real life images.This will reduce a huge amount of turn around time to make 2D animated movies and decrease cost in terms of manpower and time. In addition to that being extreme light-weight it can be used as camera filters capable of taking Comic Style Shots using mobile phone camera or edge device cameras like Raspberry Pi 4,NVIDIA Jetson NANO etc. Existing Methods like CartoonGAN with the model size close to 170 MB is too heavy weight for mobile phones and edge devices due to their scarcity in resources. Compared to the current state of the art our proposed method which has a total model size of 31 MB which clearly makes it ideal and ultra-efficient for designing of camera filters on low resource devices like mobile phones, tablets and edge devices running OS or RTOS. .Owing to use of high resolution input and usage of bigger convolution kernel size it produces richer resolution Comic-Style Pictures implementation with 6 times lesser number of parameters and with just 25 extra epoch trained on a dataset of less than 1000 which breaks the myth that all GAN need mammoth amount of data. Our network reduces the density of the Gan architecture by using Depthwise Separable Convolution which does the convolution operation on each of the RGB channels separately then we use a Point-Wise Convolution to bring back the network into required channel number using 1 by 1 kernel.This reduces the number of parameters substantially and makes it extreme light-weight and suitable for mobile phones and edge devices. The architecture mentioned in the present paper make use of Parameterised Batch Normalization Goodfellow etc al. (Deep Learning OPTIMIZATION FOR TRAINING DEEP MODELS page 320) which makes the network to use the advantage of Batch Norm for easier training while maintaining the non-linear feature capture by inducing the learnable parameters

Keywords: comic stylisation from camera image using GAN, creating 2D animated movie style custom stickers from images, depth-wise separable convolutional neural network for light-weight GAN architecture for EDGE devices, GAN architecture for 2D animated cartoonizing neural style, neural style transfer for edge, model distilation, perceptual loss

Procedia PDF Downloads 97
139 Non-Differentiable Mond-Weir Type Symmetric Duality under Generalized Invexity

Authors: Jai Prakash Verma, Khushboo Verma

Abstract:

In the present paper, a pair of Mond-Weir type non-differentiable multiobjective second-order programming problems, involving two kernel functions, where each of the objective functions contains support function, is formulated. We prove weak, strong and converse duality theorem for the second-order symmetric dual programs under η-pseudoinvexity conditions.

Keywords: non-differentiable multiobjective programming, second-order symmetric duality, efficiency, support function, eta-pseudoinvexity

Procedia PDF Downloads 223
138 Sharp Estimates of Oscillatory Singular Integrals with Rough Kernels

Authors: H. Al-Qassem, L. Cheng, Y. Pan

Abstract:

In this paper, we establish sharp bounds for oscillatory singular integrals with an arbitrary real polynomial phase P. Our kernels are allowed to be rough both on the unit sphere and in the radial direction. We show that the bounds grow no faster than log (deg(P)), which is optimal and was first obtained by Parissis and Papadimitrakis for kernels without any radial roughness. Our results substantially improve many previously known results. Among key ingredients of our methods are an L¹→L² sharp estimate and using extrapolation.

Keywords: oscillatory singular integral, rough kernel, singular integral, orlicz spaces, block spaces, extrapolation, L^{p} boundedness

Procedia PDF Downloads 433
137 Superconvergence of the Iterated Discrete Legendre Galerkin Method for Fredholm-Hammerstein Equations

Authors: Payel Das, Gnaneshwar Nelakanti

Abstract:

In this paper we analyse the iterated discrete Legendre Galerkin method for Fredholm-Hammerstein integral equations with smooth kernel. Using sufficiently accurate numerical quadrature rule, we obtain superconvergence rates for the iterated discrete Legendre Galerkin solutions in both infinity and $L^2$-norm. Numerical examples are given to illustrate the theoretical results.

Keywords: hammerstein integral equations, spectral method, discrete galerkin, numerical quadrature, superconvergence

Procedia PDF Downloads 446
136 Multi-Channel Information Fusion in C-OTDR Monitoring Systems: Various Approaches to Classify of Targeted Events

Authors: Andrey V. Timofeev

Abstract:

The paper presents new results concerning selection of optimal information fusion formula for ensembles of C-OTDR channels. The goal of information fusion is to create an integral classificator designed for effective classification of seismoacoustic target events. The LPBoost (LP-β and LP-B variants), the Multiple Kernel Learning, and Weighing of Inversely as Lipschitz Constants (WILC) approaches were compared. The WILC is a brand new approach to optimal fusion of Lipschitz Classifiers Ensembles. Results of practical usage are presented.

Keywords: Lipschitz Classifier, classifiers ensembles, LPBoost, C-OTDR systems

Procedia PDF Downloads 430
135 Parallel Fuzzy Rough Support Vector Machine for Data Classification in Cloud Environment

Authors: Arindam Chaudhuri

Abstract:

Classification of data has been actively used for most effective and efficient means of conveying knowledge and information to users. The prima face has always been upon techniques for extracting useful knowledge from data such that returns are maximized. With emergence of huge datasets the existing classification techniques often fail to produce desirable results. The challenge lies in analyzing and understanding characteristics of massive data sets by retrieving useful geometric and statistical patterns. We propose a supervised parallel fuzzy rough support vector machine (PFRSVM) for data classification in cloud environment. The classification is performed by PFRSVM using hyperbolic tangent kernel. The fuzzy rough set model takes care of sensitiveness of noisy samples and handles impreciseness in training samples bringing robustness to results. The membership function is function of center and radius of each class in feature space and is represented with kernel. It plays an important role towards sampling the decision surface. The success of PFRSVM is governed by choosing appropriate parameter values. The training samples are either linear or nonlinear separable. The different input points make unique contributions to decision surface. The algorithm is parallelized with a view to reduce training times. The system is built on support vector machine library using Hadoop implementation of MapReduce. The algorithm is tested on large data sets to check its feasibility and convergence. The performance of classifier is also assessed in terms of number of support vectors. The challenges encountered towards implementing big data classification in machine learning frameworks are also discussed. The experiments are done on the cloud environment available at University of Technology and Management, India. The results are illustrated for Gaussian RBF and Bayesian kernels. The effect of variability in prediction and generalization of PFRSVM is examined with respect to values of parameter C. It effectively resolves outliers’ effects, imbalance and overlapping class problems, normalizes to unseen data and relaxes dependency between features and labels. The average classification accuracy for PFRSVM is better than other classifiers for both Gaussian RBF and Bayesian kernels. The experimental results on both synthetic and real data sets clearly demonstrate the superiority of the proposed technique.

Keywords: FRSVM, Hadoop, MapReduce, PFRSVM

Procedia PDF Downloads 461
134 Systematic Evaluation of Convolutional Neural Network on Land Cover Classification from Remotely Sensed Images

Authors: Eiman Kattan, Hong Wei

Abstract:

In using Convolutional Neural Network (CNN) for classification, there is a set of hyperparameters available for the configuration purpose. This study aims to evaluate the impact of a range of parameters in CNN architecture i.e. AlexNet on land cover classification based on four remotely sensed datasets. The evaluation tests the influence of a set of hyperparameters on the classification performance. The parameters concerned are epoch values, batch size, and convolutional filter size against input image size. Thus, a set of experiments were conducted to specify the effectiveness of the selected parameters using two implementing approaches, named pertained and fine-tuned. We first explore the number of epochs under several selected batch size values (32, 64, 128 and 200). The impact of kernel size of convolutional filters (1, 3, 5, 7, 10, 15, 20, 25 and 30) was evaluated against the image size under testing (64, 96, 128, 180 and 224), which gave us insight of the relationship between the size of convolutional filters and image size. To generalise the validation, four remote sensing datasets, AID, RSD, UCMerced and RSCCN, which have different land covers and are publicly available, were used in the experiments. These datasets have a wide diversity of input data, such as number of classes, amount of labelled data, and texture patterns. A specifically designed interactive deep learning GPU training platform for image classification (Nvidia Digit) was employed in the experiments. It has shown efficiency in both training and testing. The results have shown that increasing the number of epochs leads to a higher accuracy rate, as expected. However, the convergence state is highly related to datasets. For the batch size evaluation, it has shown that a larger batch size slightly decreases the classification accuracy compared to a small batch size. For example, selecting the value 32 as the batch size on the RSCCN dataset achieves the accuracy rate of 90.34 % at the 11th epoch while decreasing the epoch value to one makes the accuracy rate drop to 74%. On the other extreme, setting an increased value of batch size to 200 decreases the accuracy rate at the 11th epoch is 86.5%, and 63% when using one epoch only. On the other hand, selecting the kernel size is loosely related to data set. From a practical point of view, the filter size 20 produces 70.4286%. The last performed image size experiment shows a dependency in the accuracy improvement. However, an expensive performance gain had been noticed. The represented conclusion opens the opportunities toward a better classification performance in various applications such as planetary remote sensing.

Keywords: CNNs, hyperparamters, remote sensing, land cover, land use

Procedia PDF Downloads 142
133 Some Quality Parameters of Selected Maize Hybrids from Serbia for the Production of Starch, Bioethanol and Animal Feed

Authors: Marija Milašinović-Šeremešić, Valentina Semenčenko, Milica Radosavljević, Dušanka Terzić, Ljiljana Mojović, Ljubica Dokić

Abstract:

Maize (Zea mays L.) is one of the most important cereal crops, and as such, one of the most significant naturally renewable carbohydrate raw materials for the production of energy and multitude of different products. The main goal of the present study was to investigate a suitability of selected maize hybrids of different genetic background produced in Maize Research Institute ‘Zemun Polje’, Belgrade, Serbia, for starch, bioethanol and animal feed production. All the hybrids are commercial and their detailed characterization is important for the expansion of their different uses. The starches were isolated by using a 100-g laboratory maize wet-milling procedure. Hydrolysis experiments were done in two steps (liquefaction with Termamyl SC, and saccharification with SAN Extra L). Starch hydrolysates obtained by the two-step hydrolysis of the corn flour starch were subjected to fermentation by S. cerevisiae var. ellipsoideus under semi-anaerobic conditions. The digestibility based on enzymatic solubility was performed by the Aufréré method. All investigated ZP maize hybrids had very different physical characteristics and chemical composition which could allow various possibilities of their use. The amount of hard (vitreous) and soft (floury) endosperm in kernel is considered one of the most important parameters that can influence the starch and bioethanol yields. Hybrids with a lower test weight and density and a greater proportion of soft endosperm fraction had a higher yield, recovery and purity of starch. Among the chemical composition parameters only starch content significantly affected the starch yield. Starch yields of studied maize hybrids ranged from 58.8% in ZP 633 to 69.0% in ZP 808. The lowest bioethanol yield of 7.25% w/w was obtained for hybrid ZP 611k and the highest by hybrid ZP 434 (8.96% w/w). A very significant correlation was determined between kernel starch content and the bioethanol yield, as well as volumetric productivity (48h) (r=0.66). Obtained results showed that the NDF, ADF and ADL contents in the whole maize plant of the observed ZP maize hybrids varied from 40.0% to 60.1%, 18.6% to 32.1%, and 1.4% to 3.1%, respectively. The difference in the digestibility of the dry matter of the whole plant among hybrids (ZP 735 and ZP 560) amounted to 18.1%. Moreover, the differences in the contents of the lignocelluloses fraction affected the differences in dry matter digestibility. From the results it can be concluded that genetic background of the selected maize hybrids plays an important part in estimation of the technological value of maize hybrids for various purposes. Obtained results are of an exceptional importance for the breeding programs and selection of potentially most suitable maize hybrids for starch, bioethanol and animal feed production.

Keywords: bioethanol, biomass quality, maize, starch

Procedia PDF Downloads 192
132 Fast and Accurate Finite-Difference Method Solving Multicomponent Smoluchowski Coagulation Equation

Authors: Alexander P. Smirnov, Sergey A. Matveev, Dmitry A. Zheltkov, Eugene E. Tyrtyshnikov

Abstract:

We propose a new computational technique for multidimensional (multicomponent) Smoluchowski coagulation equation. Using low-rank approximations in Tensor Train format of both the solution and the coagulation kernel, we accelerate the classical finite-difference Runge-Kutta scheme keeping its level of accuracy. The complexity of the taken finite-difference scheme is reduced from O(N^2d) to O(d^2 N log N ), where N is the number of grid nodes and d is a dimensionality of the problem. The efficiency and the accuracy of the new method are demonstrated on concrete problem with known analytical solution.

Keywords: tensor train decomposition, multicomponent Smoluchowski equation, runge-kutta scheme, convolution

Procedia PDF Downloads 387
131 Closest Possible Neighbor of a Different Class: Explaining a Model Using a Neighbor Migrating Generator

Authors: Hassan Eshkiki, Benjamin Mora

Abstract:

The Neighbor Migrating Generator is a simple and efficient approach to finding the closest potential neighbor(s) with a different label for a given instance and so without the need to calibrate any kernel settings at all. This allows determining and explaining the most important features that will influence an AI model. It can be used to either migrate a specific sample to the class decision boundary of the original model within a close neighborhood of that sample or identify global features that can help localising neighbor classes. The proposed technique works by minimizing a loss function that is divided into two components which are independently weighted according to three parameters α, β, and ω, α being self-adjusting. Results show that this approach is superior to past techniques when detecting the smallest changes in the feature space and may also point out issues in models like over-fitting.

Keywords: explainable AI, EX AI, feature importance, counterfactual explanations

Procedia PDF Downloads 120
130 Study of Oxidative Stability, Cold Flow Properties and Iodine Value of Macauba Biodiesel Blends

Authors: Acacia A. Salomão, Willian L. Gomes da Silva, Gustavo G. Shimamoto, Matthieu Tubino

Abstract:

Biodiesel physical and chemical properties depend on the raw material composition used in its synthesis. Saturated fatty acid esters confer high oxidative stability, while unsaturated fatty acid esters improve the cold flow properties. In this study, an alternative vegetal source - the macauba kernel oil - was used in the biodiesel synthesis instead of conventional sources. Macauba can be collected from native palm trees and is found in several regions in Brazil. Its oil is a promising source when compared to several other oils commonly obtained from food products, such as soybean, corn or canola oil, due to its specific characteristics. However, the usage of biodiesel made from macauba oil alone is not recommended due to the difficulty of producing macauba in large quantities. For this reason, this project proposes the usage of blends of the macauba oil with conventional oils. These blends were prepared by mixing the macauba biodiesel with biodiesels obtained from soybean, corn, and from residual frying oil, in the following proportions: 20:80, 50:50 e 80:20 (w/w). Three parameters were evaluated, using the standard methods, in order to check the quality of the produced biofuel and its blends: oxidative stability, cold filter plugging point (CFPP), and iodine value. The induction period (IP) expresses the oxidative stability of the biodiesel, the CFPP expresses the lowest temperature in which the biodiesel flows through a filter without plugging the system and the iodine value is a measure of the number of double bonds in a sample. The biodiesels obtained from soybean, residual frying oil and corn presented iodine values higher than 110 g/100 g, low oxidative stability and low CFPP. The IP values obtained from these biodiesels were lower than 8 h, which is below the recommended standard value. On the other hand, the CFPP value was found within the allowed limit (5 ºC is the maximum). Regarding the macauba biodiesel, a low iodine value was observed (31.6 g/100 g), which indicates the presence of high content of saturated fatty acid esters. The presence of saturated fatty acid esters should imply in a high oxidative stability (which was found accordingly, with IP = 64 h), and high CFPP, but curiously the latter was not observed (-3 ºC). This behavior can be explained by looking at the size of the carbon chains, as 65% of this biodiesel is composed by short chain saturated fatty acid esters (less than 14 carbons). The high oxidative stability and the low CFPP of macauba biodiesel are what make this biofuel a promising source. The soybean, corn and residual frying oil biodiesels also have low CFPP, but low oxidative stability. Therefore the blends proposed in this work, if compared to the common biodiesels, maintain the flow properties but present enhanced oxidative stability.

Keywords: biodiesel, blends, macauba kernel oil, stability oxidative

Procedia PDF Downloads 504
129 Structure Clustering for Milestoning Applications of Complex Conformational Transitions

Authors: Amani Tahat, Serdal Kirmizialtin

Abstract:

Trajectory fragment methods such as Markov State Models (MSM), Milestoning (MS) and Transition Path sampling are the prime choice of extending the timescale of all atom Molecular Dynamics simulations. In these approaches, a set of structures that covers the accessible phase space has to be chosen a priori using cluster analysis. Structural clustering serves to partition the conformational state into natural subgroups based on their similarity, an essential statistical methodology that is used for analyzing numerous sets of empirical data produced by Molecular Dynamics (MD) simulations. Local transition kernel among these clusters later used to connect the metastable states using a Markovian kinetic model in MSM and a non-Markovian model in MS. The choice of clustering approach in constructing such kernel is crucial since the high dimensionality of the biomolecular structures might easily confuse the identification of clusters when using the traditional hierarchical clustering methodology. Of particular interest, in the case of MS where the milestones are very close to each other, accurate determination of the milestone identity of the trajectory becomes a challenging issue. Throughout this work we present two cluster analysis methods applied to the cis–trans isomerism of dinucleotide AA. The choice of nucleic acids to commonly used proteins to study the cluster analysis is two fold: i) the energy landscape is rugged; hence transitions are more complex, enabling a more realistic model to study conformational transitions, ii) Nucleic acids conformational space is high dimensional. A diverse set of internal coordinates is necessary to describe the metastable states in nucleic acids, posing a challenge in studying the conformational transitions. Herein, we need improved clustering methods that accurately identify the AA structure in its metastable states in a robust way for a wide range of confused data conditions. The single linkage approach of the hierarchical clustering available in GROMACS MD-package is the first clustering methodology applied to our data. Self Organizing Map (SOM) neural network, that also known as a Kohonen network, is the second data clustering methodology. The performance comparison of the neural network as well as hierarchical clustering method is studied by means of computing the mean first passage times for the cis-trans conformational rates. Our hope is that this study provides insight into the complexities and need in determining the appropriate clustering algorithm for kinetic analysis. Our results can improve the effectiveness of decisions based on clustering confused empirical data in studying conformational transitions in biomolecules.

Keywords: milestoning, self organizing map, single linkage, structure clustering

Procedia PDF Downloads 194
128 Applying a Noise Reduction Method to Reveal Chaos in the River Flow Time Series

Authors: Mohammad H. Fattahi

Abstract:

Chaotic analysis has been performed on the river flow time series before and after applying the wavelet based de-noising techniques in order to investigate the noise content effects on chaotic nature of flow series. In this study, 38 years of monthly runoff data of three gauging stations were used. Gauging stations were located in Ghar-e-Aghaj river basin, Fars province, Iran. The noise level of time series was estimated with the aid of Gaussian kernel algorithm. This step was found to be crucial in preventing removal of the vital data such as memory, correlation and trend from the time series in addition to the noise during de-noising process.

Keywords: chaotic behavior, wavelet, noise reduction, river flow

Procedia PDF Downloads 436
127 Community Based Participatory Research in Opioid Use: Design of an Informatics Solution

Authors: Sue S. Feldman, Bradley Tipper, Benjamin Schooley

Abstract:

Nearly every community in the US has been impacted by opioid related addictions/deaths; it is a national problem that is threatening our social and economic welfare. Most believe that tackling this problem from a prevention perspective advances can be made toward breaking the chain of addiction. One mechanism, community based participatory research, involves the community in the prevention approach. This project combines that approach with a design science approach to develop an integrated solution. Findings suggested accountable care communities, transpersonal psychology, and social exchange theory as product kernel theories. Evaluation was conducted on a prototype.

Keywords: substance use and abuse recovery, community resource centers, accountable care communities, community based participatory research

Procedia PDF Downloads 117
126 A Semi-Analytical Method for Analysis of the Axially Symmetric Problem on Indentation of a Hot Circular Punch into an Arbitrarily Nonhomogeneous Halfspace

Authors: S. Aizikovich, L. Krenev, Y. Tokovyy, Y. C. Wang

Abstract:

An approximate analytical-numerical solution to the axisymmetric problem on thermo-mechanical indentation of a flat cylindrical punch into an arbitrarily non-homogeneous elastic half-space is constructed by making use of the bilateral asymptotic method. The key point of this method lies in evaluation of the ker¬nels in the obtained integral equations by making use of a numerical technique. Once the structure of the kernel is defined, it then is approximated by an analytical expression of special kind so that the solution of the integral equation can be achieved analytically. This fact allows for construction of the solution in an analytical form, which is convenient for analysis of the mechanical effects concerned with arbitrarily presumed non-homogeneity of the material.

Keywords: contact problem, circular punch, arbitrarily-nonhomogeneous halfspace

Procedia PDF Downloads 494
125 Evaluation of Yield and Yield Components of Malaysian Palm Oil Board-Senegal Oil Palm Germplasm Using Multivariate Tools

Authors: Khin Aye Myint, Mohd Rafii Yusop, Mohd Yusoff Abd Samad, Shairul Izan Ramlee, Mohd Din Amiruddin, Zulkifli Yaakub

Abstract:

The narrow base of genetic is the main obstacle of breeding and genetic improvement in oil palm industry. In order to broaden the genetic bases, the Malaysian Palm Oil Board has been extensively collected wild germplasm from its original area of 11 African countries which are Nigeria, Senegal, Gambia, Guinea, Sierra Leone, Ghana, Cameroon, Zaire, Angola, Madagascar, and Tanzania. The germplasm collections were established and maintained as a field gene bank in Malaysian Palm Oil Board (MPOB) Research Station in Kluang, Johor, Malaysia to conserve a wide range of oil palm genetic resources for genetic improvement of Malaysian oil palm industry. Therefore, assessing the performance and genetic diversity of the wild materials is very important for understanding the genetic structure of natural oil palm population and to explore genetic resources. Principal component analysis (PCA) and Cluster analysis are very efficient multivariate tools in the evaluation of genetic variation of germplasm and have been applied in many crops. In this study, eight populations of MPOB-Senegal oil palm germplasm were studied to explore the genetic variation pattern using PCA and cluster analysis. A total of 20 yield and yield component traits were used to analyze PCA and Ward’s clustering using SAS 9.4 version software. The first four principal components which have eigenvalue >1 accounted for 93% of total variation with the value of 44%, 19%, 18% and 12% respectively for each principal component. PC1 showed highest positive correlation with fresh fruit bunch (0.315), bunch number (0.321), oil yield (0.317), kernel yield (0.326), total economic product (0.324), and total oil (0.324) while PC 2 has the largest positive association with oil to wet mesocarp (0.397) and oil to fruit (0.458). The oil palm population were grouped into four distinct clusters based on 20 evaluated traits, this imply that high genetic variation existed in among the germplasm. Cluster 1 contains two populations which are SEN 12 and SEN 10, while cluster 2 has only one population of SEN 3. Cluster 3 consists of three populations which are SEN 4, SEN 6, and SEN 7 while SEN 2 and SEN 5 were grouped in cluster 4. Cluster 4 showed the highest mean value of fresh fruit bunch, bunch number, oil yield, kernel yield, total economic product, and total oil and Cluster 1 was characterized by high oil to wet mesocarp, and oil to fruit. The desired traits that have the largest positive correlation on extracted PCs could be utilized for the improvement of oil palm breeding program. The populations from different clusters with the highest cluster means could be used for hybridization. The information from this study can be utilized for effective conservation and selection of the MPOB-Senegal oil palm germplasm for the future breeding program.

Keywords: cluster analysis, genetic variability, germplasm, oil palm, principal component analysis

Procedia PDF Downloads 140
124 A Bathtub Curve from Nonparametric Model

Authors: Eduardo C. Guardia, Jose W. M. Lima, Afonso H. M. Santos

Abstract:

This paper presents a nonparametric method to obtain the hazard rate “Bathtub curve” for power system components. The model is a mixture of the three known phases of a component life, the decreasing failure rate (DFR), the constant failure rate (CFR) and the increasing failure rate (IFR) represented by three parametric Weibull models. The parameters are obtained from a simultaneous fitting process of the model to the Kernel nonparametric hazard rate curve. From the Weibull parameters and failure rate curves the useful lifetime and the characteristic lifetime were defined. To demonstrate the model the historic time-to-failure of distribution transformers were used as an example. The resulted “Bathtub curve” shows the failure rate for the equipment lifetime which can be applied in economic and replacement decision models.

Keywords: bathtub curve, failure analysis, lifetime estimation, parameter estimation, Weibull distribution

Procedia PDF Downloads 413
123 Balancing and Synchronization Control of a Two Wheel Inverted Pendulum Vehicle

Authors: Shiuh-Jer Huang, Shin-Ham Lee, Sheam-Chyun Lin

Abstract:

A two wheel inverted pendulum (TWIP) vehicle is built with two hub DC motors for motion control evaluation. Arduino Nano micro-processor is chosen as the control kernel for this electric test plant. Accelerometer and gyroscope sensors are built in to measure the tilt angle and angular velocity of the inverted pendulum vehicle. Since the TWIP has significantly hub motor dead zone and nonlinear system dynamics characteristics, the vehicle system is difficult to control by traditional model based controller. The intelligent model-free fuzzy sliding mode controller (FSMC) was employed as the main control algorithm. Then, intelligent controllers are designed for TWIP balance control, and two wheels synchronization control purposes.

Keywords: balance control, synchronization control, two-wheel inverted pendulum, TWIP

Procedia PDF Downloads 363