Search results for: feature selection methods
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 17415

17235 A Hybrid System for Boreholes Soil Sample

Authors: Ali Ulvi Uzer

Abstract:

Data reduction is an important topic in pattern recognition applications. The basic concept is to reduce large amounts of data to their meaningful parts. Principal Component Analysis (PCA) is frequently used for data reduction. The Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane: given labeled training data, the algorithm outputs an optimal hyperplane that categorizes new examples. This study offers a hybrid approach that uses PCA for data reduction and SVM for classification. To assess the accuracy of the suggested system, soil samples taken from two boreholes were used. The classification accuracies for this dataset were obtained using ten-fold cross-validation. The results suggest that this system, which relies on data reduction, is feasible for faster recognition of the dataset, so the study's results appear promising.
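
The ten-fold cross-validation mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the author's code: the function names `k_fold_indices` and `cross_validate` are hypothetical, the folds are contiguous (in practice the samples would usually be shuffled first), and `evaluate` stands in for whatever fits PCA and the SVM on the training indices and scores the held-out fold.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(n_samples, evaluate, k=10):
    """Average evaluate(train_idx, test_idx) over the k folds."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k
```

When a reduction step such as PCA is part of the pipeline, it is usually fitted inside `evaluate` on the training indices only, so that the held-out fold does not influence the projection.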

Keywords: feature selection, sequential forward selection, support vector machines, soil sample

Procedia PDF Downloads 425
17234 Automatic Moment-Based Texture Segmentation

Authors: Tudor Barbu

Abstract:

An automatic moment-based texture segmentation approach is proposed in this paper. First, we describe the related work in this computer vision domain. Our texture feature extraction, the first part of the texture recognition process, produces a set of moment-based feature vectors. For each image pixel, a texture feature vector is computed as a sequence of area moments. Second, an automatic pixel classification approach is proposed. The feature vectors are clustered using an unsupervised classification algorithm, with the optimal number of clusters determined using a measure based on validation indexes. From the resulting pixel classes, the desired texture regions of the image are easily determined.

Keywords: image segmentation, moment-based, texture analysis, automatic classification, validation indexes

Procedia PDF Downloads 388
17233 An Automated System for the Detection of Citrus Greening Disease Based on Visual Descriptors

Authors: Sidra Naeem, Ayesha Naeem, Sahar Rahim, Nadia Nawaz Qadri

Abstract:

Citrus greening is a bacterial disease that causes considerable damage to citrus fruits worldwide. An efficient method for detecting this disease is needed to minimize production loss. This paper presents a pattern recognition system that comprises three stages for the detection of citrus greening from orange leaves: segmentation, feature extraction and classification. Image segmentation is accomplished by adaptive thresholding. The feature extraction stage comprises three visual descriptors, i.e., shape, color and texture. For shape we use the asymmetry index, for color the histogram of the Cb component in the YCbCr domain, and for texture the local binary pattern. Classification was done using support vector machines and k-nearest neighbors. The best performance of the system, accuracy = 88.02% and AUROC = 90.1%, was achieved with automatically segmented images. Our experiments validate that (1) segmentation is an imperative preprocessing step for computer-assisted diagnosis of citrus greening, and (2) the combination of shape, color and texture features forms a complementary set for the identification of citrus greening disease.
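
The local binary pattern used as the texture descriptor can be illustrated with a short sketch. It assumes the basic 8-neighbour variant on a grayscale image stored as a list of rows; the abstract does not say which LBP variant, radius or bit ordering the authors used, so the clockwise ordering and the names `lbp_code` and `lbp_histogram` are assumptions made for illustration.

```python
def lbp_code(image, r, c):
    """8-neighbour local binary pattern code for the interior pixel at (r, c)."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

The histogram, rather than the per-pixel codes, is what would typically be passed to a classifier such as an SVM or k-nearest neighbors.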

Keywords: citrus greening, pattern recognition, feature extraction, classification

Procedia PDF Downloads 144
17232 Selection of Solid Waste Landfill Site Using Geographical Information System (GIS)

Authors: Fatih Iscan, Ceren Yagci

Abstract:

Rapid population growth, urbanization and industrialization are among the most important causes of environmental problems. The disposal and management of solid waste are also among the most important environmental problems. One of the main problems in solid waste management is selecting the best site for solid waste disposal. Lately, Geographical Information Systems (GIS) have been used to ease the selection of landfill areas. GIS can model the necessary economic, environmental and political constraints, and it plays an important role in landfill site selection as a decision support tool. In this study, map layers are examined to minimize the effect of environmental, social and cultural factors and to maximize the effect of engineering and economic factors in landfill site selection, and the use of GIS as a decision support mechanism for solid waste landfill site selection is presented for the Güzelyurt district of Aksaray, Turkey.

Keywords: GIS, landfill, solid waste, spatial analysis

Procedia PDF Downloads 332
17231 Detection and Classification of Mammogram Images Using Principle Component Analysis and Lazy Classifiers

Authors: Rajkumar Kolangarakandy

Abstract:

Feature extraction and selection are the primary parts of any mammogram classification algorithm. The choice of features, attributes or measurements has an important influence on any classification system. Discrete Wavelet Transformation (DWT) coefficients are among the prominent features for representing images in the frequency domain. The features obtained after decomposing the mammogram images with wavelet transformations have high dimension. Even though the features are high-dimensional, they are highly correlated and redundant. Dimensionality reduction techniques play an important role in selecting the optimum number of features from high-dimensional, highly correlated data. PCA is a mathematical tool that reduces the dimensionality of the data while retaining most of the variation in the dataset. In this paper, a multilevel classification of mammogram images using reduced discrete wavelet transformation coefficients and lazy classifiers is proposed. The classification is accomplished in two levels. In the first level, mammogram ROIs extracted from the dataset are classified as normal or abnormal. In the second level, all the abnormal mammogram ROIs are classified as benign or malignant. A further classification is also accomplished based on the variation in structure and intensity distribution of the images in the dataset. The lazy classifiers KStar, IBL and LWL are used for classification. The classification results obtained with the reduced feature set are highly promising and are compared with the performance obtained without dimension reduction.

Keywords: PCA, wavelet transformation, lazy classifiers, Kstar, IBL, LWL

Procedia PDF Downloads 314
17230 Automatic Multi-Label Image Annotation System Guided by Firefly Algorithm and Bayesian Method

Authors: Saad M. Darwish, Mohamed A. El-Iskandarani, Guitar M. Shawkat

Abstract:

Nowadays, the amount of available multimedia data is continuously on the rise. Finding a required image is a challenging task for an ordinary user. Content-based image retrieval (CBIR) computes relevance based on the visual similarity of low-level image features such as color and texture. However, there is a gap between low-level visual features and the semantic meanings required by applications. The typical method of bridging this semantic gap is automatic image annotation (AIA), which extracts semantic features using machine learning techniques. In this paper, a multi-label image annotation system guided by the Firefly algorithm and a Bayesian method is proposed. First, images are segmented using the maximum variance intra cluster criterion and the Firefly algorithm, a swarm-based approach with high convergence speed and low computational cost that searches for the optimal multiple thresholds. Feature extraction techniques based on color features and region properties are applied to obtain the representative features. After that, the images are annotated using a translation model based on the Net Bayes system, which is efficient for multi-label learning with high precision and low complexity. Experiments are performed using the Corel database. The results show that the proposed system is better than traditional ones for automatic image annotation and retrieval.

Keywords: feature extraction, feature selection, image annotation, classification

Procedia PDF Downloads 559
17229 Artificial Bee Colony Optimization for SNR Maximization through Relay Selection in Underlay Cognitive Radio Networks

Authors: Babar Sultan, Kiran Sultan, Waseem Khan, Ijaz Mansoor Qureshi

Abstract:

In this paper, a novel idea for enhancing the performance of the secondary network is proposed for Underlay Cognitive Radio Networks (CRNs). In Underlay CRNs, primary users (PUs) impose strict interference constraints on the secondary users (SUs). The proposed scheme is based on Artificial Bee Colony (ABC) optimization for relay selection and power allocation to handle this primary challenge of Underlay CRNs. ABC is a simple, population-based optimization algorithm that attains a global optimum solution by combining local search methods (Employed and Onlooker Bees) and global search methods (Scout Bees). The proposed two-phase relay selection and power allocation algorithm aims to maximize the signal-to-noise ratio (SNR) at the destination while operating in underlay mode. The proposed algorithm has low computational complexity, and its performance is verified through simulation results for different numbers of potential relays, different interference threshold levels and different transmit power thresholds for the selected relays.

Keywords: artificial bee colony, underlay spectrum sharing, cognitive radio networks, amplify-and-forward

Procedia PDF Downloads 550
17228 Local Directional Encoded Derivative Binary Pattern Based Coral Image Classification Using Weighted Distance Gray Wolf Optimization Algorithm

Authors: Annalakshmi G., Sakthivel Murugan S.

Abstract:

This paper presents a local directional encoded derivative binary pattern (LDEDBP) feature extraction method that can be applied to the classification of submarine coral reef images. The classification of coral reef images using texture features is difficult due to the dissimilarities among class samples. The proposed approach extracts the complete structural arrangement of the local region using the local binary pattern (LBP) and also extracts the edge information using the local directional pattern (LDP) from the edge response available in a particular region, thereby achieving an extra discriminative feature value. Typically, the LDP extracts the edge details in all eight directions. Integrating the edge responses with the local binary pattern yields a more robust texture descriptor than the other descriptors used in texture feature extraction methods. Finally, the proposed technique is applied to an extreme learning machine (ELM) with a meta-heuristic algorithm known as the weighted distance grey wolf optimizer (WDGWO) to optimize the input weights and biases of single-hidden-layer feed-forward neural networks (SLFN). In the empirical results, ELM-WDGWO demonstrated better accuracy on all coral datasets, namely RSMAS, EILAT, EILAT2, and MLC, compared with other state-of-the-art algorithms. The proposed method achieves the highest overall classification accuracy, 94%, among the compared state-of-the-art methods.

Keywords: feature extraction, local directional pattern, ELM classifier, GWO optimization

Procedia PDF Downloads 138
17227 Using New Machine Algorithms to Classify Iranian Musical Instruments According to Temporal, Spectral and Coefficient Features

Authors: Ronak Khosravi, Mahmood Abbasi Layegh, Siamak Haghipour, Avin Esmaili

Abstract:

In this paper, a study was carried out on the classification of musical woodwind instruments using a small set of features selected from a broad range of extracted ones by the sequential forward selection method. First, we extract 42 features for each record in a music database of 402 sound files belonging to five different groups: flutes (end-blown and internal duct), single-reed, double-reed (exposed and capped), triple-reed and quadruple-reed. Then, the sequential forward selection method is adopted to choose the best feature set in order to achieve very high classification accuracy. Two classification techniques, support vector machines and relevance vector machines, were tested, and an accuracy of up to 96% can be achieved using 21 time, frequency and coefficient features and a relevance vector machine with the Gaussian kernel function.
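
Sequential forward selection, as named in the abstract, greedily grows a feature subset one feature at a time. The sketch below shows the general technique under stated assumptions: the scoring function is supplied by the caller, and the toy scorer `nearest_centroid_accuracy` (training accuracy of a nearest-centroid classifier) is a hypothetical stand-in for the SVM or RVM accuracy used in the study.

```python
def sequential_forward_selection(n_features, score, n_select):
    """Greedily add the feature whose inclusion gives the highest score(subset)."""
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


def nearest_centroid_accuracy(X, y, subset):
    """Training accuracy of a nearest-centroid classifier on the chosen columns."""
    classes = sorted(set(y))
    centroids = {}
    for cls in classes:
        rows = [x for x, label in zip(X, y) if label == cls]
        centroids[cls] = [sum(row[f] for row in rows) / len(rows) for f in subset]
    correct = 0
    for x, label in zip(X, y):
        def dist(cls):
            # Squared Euclidean distance to the class centroid.
            return sum((x[f] - m) ** 2 for f, m in zip(subset, centroids[cls]))
        if min(classes, key=dist) == label:
            correct += 1
    return correct / len(X)
```

A call such as `sequential_forward_selection(42, lambda s: nearest_centroid_accuracy(X, y, s), 21)` would mirror the 42-to-21 reduction reported above. In practice the score would more likely be a cross-validated accuracy than a training accuracy.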

Keywords: coefficient features, relevance vector machines, spectral features, support vector machines, temporal features

Procedia PDF Downloads 288
17226 Music Genre Classification Based on Non-Negative Matrix Factorization Features

Authors: Soyon Kim, Edward Kim

Abstract:

In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers provides statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured with timbre features such as mel-frequency cepstral coefficients (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) statistical features such as the mean, variance, minimum, and maximum of the timbre features and (2) modulation spectrum features such as the spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. In addition to these conventional basic long-term feature vectors, NMF-based feature vectors are proposed for genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, the entire full-band spectrum was used. For NMF-BFV, however, only the low-band spectrum was used, since the high-frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification.
In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as the classifier. The GTZAN multi-genre music database was used for training and testing. It is composed of 10 genres with 100 songs per genre. To increase the reliability of the experiments, 10-fold cross-validation was used. For a given input song, an extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for the 10 genres. An NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features such as statistical features and modulation spectrum features, the NMF features provided increased accuracy with a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, but the basic features with NMF-LSM and NMF-BFV provided 85.1% and 84.2% accuracy, respectively. The basic features required a dimensionality of 460, while NMF-LSM and NMF-BFV each required a dimensionality of 10. Combining the basic features, NMF-LSM and NMF-BFV with an SVM with a radial basis function (RBF) kernel produced a significantly higher classification accuracy of 88.3% with a feature dimensionality of 480.
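
The test-stage step, computing non-negative weights for an input against a fixed set of pre-trained basis vectors, can be sketched as below. The abstract does not state which NMF algorithm was used; this sketch assumes the standard multiplicative update for the squared-error objective, and the function name `nmf_weights` and the toy basis are hypothetical.

```python
def nmf_weights(W, v, n_iter=200, eps=1e-12):
    """Non-negative weights h approximately minimising ||v - W h||^2 for a fixed basis W.

    W is a list of m rows with k non-negative entries each; v has length m.
    Uses the multiplicative update h_j <- h_j * (W^T v)_j / (W^T W h)_j.
    """
    m, k = len(W), len(W[0])
    h = [1.0] * k  # strictly positive start so the updates can move every weight
    wt_v = [sum(W[i][j] * v[i] for i in range(m)) for j in range(k)]
    for _ in range(n_iter):
        wh = [sum(W[i][j] * h[j] for j in range(k)) for i in range(m)]
        wt_wh = [sum(W[i][j] * wh[i] for i in range(m)) for j in range(k)]
        # eps guards against division by zero when a basis column is unused.
        h = [h[j] * wt_v[j] / (wt_wh[j] + eps) for j in range(k)]
    return h
```

With one basis vector (or one group of basis vectors) per genre, the resulting weight vector has one entry per genre, which matches the 10-dimensional NMF feature vectors described above.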

Keywords: mel-frequency cepstral coefficient (MFCC), music genre classification, non-negative matrix factorization (NMF), support vector machine (SVM)

Procedia PDF Downloads 266
17225 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents

Authors: Chothmal, Basant Agarwal

Abstract:

Sentiment analysis classifies a given review document as positive or negative in polarity. Sentiment analysis research has increased tremendously in recent times due to its large number of applications in industry and academia. Sentiment analysis models can be used to determine the opinion of a user towards any entity or product. E-commerce companies can use sentiment analysis models to improve their products on the basis of users’ opinions. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we first extract features from one class of documents and then use the one-class SVM model to test whether a new document lies within the model or is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.

Keywords: feature selection methods, machine learning, NB, one-class SVM, sentiment analysis, support vector machine

Procedia PDF Downloads 482
17224 Investigation of the Main Trends of Tourist Expenses in Georgia

Authors: Nino Abesadze, Marine Mindorashvili, Nino Paresashvili

Abstract:

The main purpose of the article is to carry out a comprehensive statistical analysis of the tourist expenses of foreign visitors. We used a mixed selection technique that combines the rules of random and proportional selection. SPSS software was used to compute statistical data for the corresponding analysis. The methodology of tourism statistics was implemented according to international standards. Important information was collected and grouped from the major Georgian airports. Techniques of statistical observation were prepared. A representative population of foreign visitors and a rule for selecting respondents were determined. The number of tourists shows a growing trend, and the share of tourists from post-Soviet countries is constantly increasing. The level of satisfaction with tourist facilities and quality of service has grown, but there is still a disparity between quality of service and prices. The structure of tourist expenses of foreign visitors is diverse; the competitiveness of tourist products of Georgian tourist companies is higher.

Keywords: tourist, expenses, methods, statistics, analysis

Procedia PDF Downloads 315
17223 Optimized Real Ground Motion Scaling for Vulnerability Assessment of Building Considering the Spectral Uncertainty and Shape

Authors: Chen Bo, Wen Zengping

Abstract:

Based on the results of previous studies, we focus on real ground motion selection and scaling methods for structural performance-based seismic evaluation using nonlinear dynamic analysis. The input earthquake ground motions should be determined appropriately to make them compatible with the site-specific hazard level considered. Thus, an optimized selection and scaling method is established that uses not only Monte Carlo simulation to create the stochastic simulation spectrum, considering the multivariate lognormal distribution of the target spectrum, but also a spectral shape parameter. Its applications in structural fragility analysis are demonstrated through case studies. Compared to the previous scheme, which does not consider the uncertainty of the target spectrum, the method shown here can ensure that the selected records are in good agreement with the median value, standard deviation and spectral correction of the target spectrum, and clearly reveals the uncertainty of the site-specific hazard level. Meanwhile, it can help improve computational efficiency and matching accuracy. Given the important influence of the target spectrum’s uncertainty on structural seismic fragility analysis, this work can provide a reasonable and reliable basis for structural seismic evaluation under a scenario earthquake environment.

Keywords: ground motion selection, scaling method, seismic fragility analysis, spectral shape

Procedia PDF Downloads 266
17222 Feature Extraction Based on Contourlet Transform and Log Gabor Filter for Detection of Ulcers in Wireless Capsule Endoscopy

Authors: Nimisha Elsa Koshy, Varun P. Gopi, V. I. Thajudin Ahamed

Abstract:

Complete visualization of the gastrointestinal (GI) tract is not possible with conventional endoscopic exams. Wireless Capsule Endoscopy (WCE) is a low-risk, painless, noninvasive procedure for diagnosing conditions such as bleeding, polyps, ulcers, and Crohn's disease within the human digestive tract, especially the small intestine, which was unreachable using traditional endoscopic methods. However, analyzing the massive number of images produced by WCE is tedious and time-consuming for physicians. Hence, researchers have developed software methods to detect these diseases automatically, which can improve the effectiveness of WCE. In this paper, a novel textural feature extraction method based on the Contourlet transform and the Log Gabor filter is proposed to distinguish ulcer regions from normal regions. The results show that the proposed method performs well, with a high accuracy rate of 94.16% using a Support Vector Machine (SVM) classifier in HSV colour space.

Keywords: contourlet transform, log gabor filter, ulcer, wireless capsule endoscopy

Procedia PDF Downloads 516
17221 A Theoretical Framework for Conceptualizing Integration of Environmental Sustainability into Supplier Selection

Authors: Tonny Ograh, Joshua Ayarkwa, Dickson Osei-Asibey, Alex Acheampong, Peter Amoah

Abstract:

Theories are used to improve the conceptualization of research ideas. These theories provide valuable elucidations that help us to grasp the meaning of research findings. Nevertheless, the use of theories to promote studies of green supplier selection in procurement decisions has attracted little attention. With the emergence of sustainable procurement, public procurement practitioners in Ghana have yet to acquire relevant knowledge of green supplier selection, owing to insufficient knowledge and a lack of appropriate frameworks. The flagrancy of the consequences of public procurers’ failure to integrate environmental considerations into supplier selection explains the adoption of a multi-theory approach for understanding the dynamics of green integration into supplier selection. In this paper, the practicality of three theories for improving the understanding of the factors that enhance the integration of environmental sustainability into supplier selection was reviewed. The three theories are Resource-Based Theory, Human Capital Theory and Absorptive Capacity Theory. This review uncovered knowledge management, top management commitment, and environmental management capabilities as important elements needed for the integration of environmental sustainability into supplier selection in public procurement. The theoretical review yielded a framework that conceptualizes the knowledge and capabilities of practitioners relevant to the incorporation of environmental sustainability into supplier selection in public procurement.

Keywords: environmental, sustainability, supplier selection, environmental procurement, sustainable procurement

Procedia PDF Downloads 149
17220 The Influence of Noise on Aerial Image Semantic Segmentation

Authors: Pengchao Wei, Xiangzhong Fang

Abstract:

Noise is ubiquitous in this world. Denoising is an essential technology, especially in image semantic segmentation, where noise is generally categorized into two main types, i.e., feature noise and label noise. This paper focuses on modeling label noise and investigating the behavior of different types of label noise on image semantic segmentation tasks using K-Nearest-Neighbor and Convolutional Neural Network classifiers. Performance with and without label noise is evaluated and illustrated. In addition, the influence of feature noise on the image semantic segmentation task is studied, and a feature noise reduction method is applied to mitigate its influence on the learning procedure.

Keywords: convolutional neural network, denoising, feature noise, image semantic segmentation, k-nearest-neighbor, label noise

Procedia PDF Downloads 190
17219 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms

Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang

Abstract:

Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving bioassay predictions. The first step is instance selection, in which the dataset is divided into training, testing, and validation sets. The second step is discretization, which partitions the data in consideration of accuracy vs. precision. The third step is normalization, where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection, where key chemical properties and attributes are generated. The streamlined results are then analyzed for prediction effectiveness by various machine learning algorithms in tools including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combinations of preprocessing steps and machine learning algorithms in producing more consistent and accurate predictions.
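
The third preprocessing step, normalizing values to the range 0 to 1, is commonly done with min-max scaling. The sketch below assumes that approach (the abstract does not name the exact formula) and uses hypothetical function names; it takes the minimum and maximum from the training split only and reuses them for the other splits.

```python
def fit_min_max(train_column):
    """Return the (min, max) of a training column for later scaling."""
    return min(train_column), max(train_column)


def apply_min_max(column, lo, hi):
    """Scale values so that lo maps to 0.0 and hi maps to 1.0."""
    if hi == lo:
        # A constant column carries no information; map everything to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]
```

Values in the test or validation split that lie outside the training range will fall outside [0, 1] with this scheme; whether to clip them is a design choice the abstract does not address.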

Keywords: bioassay, machine learning, preprocessing, virtual screen

Procedia PDF Downloads 252
17218 Feature Selection for Production Schedule Optimization in Transition Mines

Authors: Angelina Anani, Ignacio Ortiz Flores, Haitao Li

Abstract:

The use of underground mining methods has increased significantly over the past decades. This increase has also been spurred on by several mines transitioning from surface to underground mining. However, determining the transition depth can be a challenging task, especially when coupled with production schedule optimization. Several researchers have simplified the problem by excluding operational features relevant to production schedule optimization. Our research objective is to investigate the extent to which accounting for the operational features of transition mines affects the optimal production schedule. We also provide a framework of factors to consider in production schedule optimization for transition mines. An integrated mixed-integer linear programming (MILP) model is developed that maximizes the NPV as a function of production schedule and transition depth. A case study is performed to validate the model, with a comparative sensitivity analysis to obtain operational insights.

Keywords: underground mining, transition mines, mixed-integer linear programming, production schedule

Procedia PDF Downloads 131
17217 Assisted Video Colorization Using Texture Descriptors

Authors: Andre Peres Ramos, Franklin Cesar Flores

Abstract:

Colorization is the process of adding colors to a monochromatic image or video. Usually, the process involves segmenting the image into regions of interest and then applying colors to each one; for videos, this process is repeated for each frame, which makes it a tedious and time-consuming job. We propose a new assisted method for video colorization: the user only has to colorize one frame, and the colors are then propagated to the following frames. The user can intervene at any time to correct errors in color assignment. The method consists of extracting intensity and texture descriptors from the frames and then performing feature matching to determine the best color for each segment. To reduce computation time and improve spatial coherence, we narrow the search area and weight each feature to emphasize texture descriptors. To give a more natural result, we use an optimization algorithm to perform the color propagation. Experimental results on several image sequences, compared with other existing methods, demonstrate that the proposed method performs better colorization with less time and user intervention.

Keywords: colorization, feature matching, texture descriptors, video segmentation

Procedia PDF Downloads 141
17216 Bayesian Variable Selection in Quantile Regression with Application to the Health and Retirement Study

Authors: Priya Kedia, Kiranmoy Das

Abstract:

There is a rich literature on variable selection in the regression setting. However, most of these methods assume normality of the response variable in order to implement the methodology and establish the statistical properties of the estimates. In many real applications, the distribution of the response variable may be non-Gaussian, and one might be interested in finding the best subset of covariates at some predetermined quantile level. We develop a dynamic Bayesian approach for variable selection in the quantile regression framework. We use a zero-inflated mixture prior for the regression coefficients and consider the asymmetric Laplace distribution for the response variable to model different quantiles of its distribution. An efficient Gibbs sampler is developed for our computation. Our proposed approach is assessed through extensive simulation studies, and a real application of the approach is also illustrated. We consider data from the Health and Retirement Study conducted by the University of Michigan, and select the important predictors when the outcome of interest is out-of-pocket medical cost, which is considered an important measure of financial risk. Our analysis finds important predictors at different quantiles of the outcome, and thus enhances our understanding of the effects of different predictors on out-of-pocket medical cost.
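
The link between the asymmetric Laplace distribution and quantiles comes from the check (pinball) loss: maximizing an asymmetric Laplace likelihood in the location parameter is equivalent to minimizing this loss. The sketch below illustrates only that loss on a plain sample, not the Bayesian variable selection or the Gibbs sampler; the function names are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def empirical_quantile_by_loss(values, tau):
    """Return the sample value that minimises the total check loss at level tau."""
    def total_loss(q):
        return sum(check_loss(v - q, tau) for v in values)
    return min(sorted(values), key=total_loss)
```

For `tau = 0.5` the loss is symmetric and the minimizer is the sample median; other values of `tau` penalize over- and under-prediction unequally, which is what lets a regression target a chosen quantile rather than the mean.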

Keywords: variable selection, quantile regression, Gibbs sampler, asymmetric Laplace distribution

Procedia PDF Downloads 127
17215 Predicting Stack Overflow Accepted Answers Using Features and Models with Varying Degrees of Complexity

Authors: Osayande Pascal Omondiagbe, Sherlock A. Licorish

Abstract:

Stack Overflow is a popular community question and answer portal used by practitioners to solve technology-related challenges during software development. Previous studies have shown that this forum is becoming a substitute for official programming language documentation. While tools have looked to aid developers by presenting interfaces to explore Stack Overflow, developers often face challenges searching through many possible answers to their questions, and this extends development time. To this end, researchers have provided ways of predicting acceptable Stack Overflow answers using various modeling techniques. However, less attention has been paid to examining the performance and quality of commonly used modeling methods, especially in relation to the complexity of models and features. Such insights could be of practical significance to the many practitioners who use Stack Overflow. This study examines the performance and quality of various modeling methods used for predicting acceptable answers on Stack Overflow, drawn from 2014, 2015 and 2016. Our findings reveal significant differences in models’ performance and quality depending on the type of features and the complexity of the models used. Researchers examining classifiers’ performance and quality and features’ complexity may leverage these findings in selecting suitable techniques when developing prediction models.

Keywords: feature selection, modeling and prediction, neural network, random forest, stack overflow

Procedia PDF Downloads 111
17214 Emotion Mining and Attribute Selection for Actionable Recommendations to Improve Customer Satisfaction

Authors: Jaishree Ranganathan, Poonam Rajurkar, Angelina A. Tzacheva, Zbigniew W. Ras

Abstract:

In today’s world, business often depends on customer feedback and reviews. Sentiment analysis helps identify and extract information about the sentiment or emotion of a topic or document. Attribute selection is a challenging problem, especially with large datasets in actionable pattern mining algorithms. Action Rule Mining is one of the methods used to discover actionable patterns from data. Action Rules describe specific actions to be taken, in the form of conditions, that help achieve a desired outcome. The rules help to change from an undesirable or negative state to a more desirable or positive state. In this paper, we present a lexicon-based weighted scheme to identify emotions from customer feedback data in the area of manufacturing business. We also use rough sets and explore the attribute selection method for large-scale datasets. We then apply actionable pattern mining to extract possible emotion change recommendations. Such recommendations help business analysts improve their customer service, which leads to customer satisfaction and increased sales revenue.

Keywords: actionable pattern discovery, attribute selection, business data, data mining, emotion

Procedia PDF Downloads 170
17213 Markowitz and Implementation of a Multi-Objective Evolutionary Technique Applied to the Colombia Stock Exchange (2009-2015)

Authors: Feijoo E. Colomine Duran, Carlos E. Peñaloza Corredor

Abstract:

Modeling the selection of components for a financial investment (a portfolio) gives rise to a variety of problems that can be addressed with optimization techniques under evolutionary schemes. By its nature, the problem of selecting investment components involves a dichotomous relationship between two opposing elements: the performance of the portfolio and the risk incurred by choosing it. Markowitz modeled this relationship as a mean (performance) - variance (risk) problem, i.e., performance must be maximized and risk minimized. This research included the study and implementation of multi-objective evolutionary techniques to solve these problems, taking as its experimental framework the equities market of the Colombia Stock Exchange between 2009 and 2015. Three multi-objective evolutionary algorithms, namely the Nondominated Sorting Genetic Algorithm II (NSGA-II), the Strength Pareto Evolutionary Algorithm 2 (SPEA2), and Indicator-Based Selection in Multiobjective Search (IBEA), were compared using two well-known performance measures, the hypervolume indicator and the R_2 indicator; a nonparametric statistical analysis using the Wilcoxon rank-sum test was also carried out. The comparative analysis also includes an evaluation of the financial efficiency of the investment portfolios chosen by the various algorithms, measured through the Sharpe ratio. It is shown that the portfolio provided by the algorithms mentioned above is very well positioned relative to the different stock indices provided by the Colombia Stock Exchange.
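
The mean-variance trade-off and the Sharpe ratio mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, any inputs are made up by the caller, and the code is not the authors' implementation of NSGA-II, SPEA2, or IBEA. The `dominates` helper only shows the Pareto comparison that such multi-objective algorithms rely on.

```python
import math


def portfolio_return(weights, mean_returns):
    """Expected portfolio return: sum_i w_i * mu_i."""
    return sum(w * r for w, r in zip(weights, mean_returns))


def portfolio_variance(weights, cov):
    """Portfolio variance: sum_i sum_j w_i * cov[i][j] * w_j."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


def sharpe_ratio(weights, mean_returns, cov, risk_free=0.0):
    """Excess return per unit of risk (standard deviation)."""
    excess = portfolio_return(weights, mean_returns) - risk_free
    return excess / math.sqrt(portfolio_variance(weights, cov))


def dominates(a, b):
    """Pareto dominance for (return, variance) pairs.

    a dominates b if it has at least as much return and at most as much
    variance, and is strictly better in at least one of the two.
    """
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])
```

For example, with hypothetical weights `[0.5, 0.5]`, mean returns `[0.10, 0.20]`, and a diagonal covariance matrix `[[0.04, 0.0], [0.0, 0.09]]`, the expected return is 0.15 and the variance is 0.0325.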

Keywords: finance, optimization, portfolio, Markowitz, evolutionary algorithms

Procedia PDF Downloads 270
17212 Parameter Selection for Computationally Efficient Use of the BFVrns Fully Homomorphic Encryption Scheme

Authors: Cavidan Yakupoglu, Kurt Rohloff

Abstract:

In this study, we aim to provide a novel parameter selection model for the BFVrns scheme, which is one of the prominent FHE schemes. Parameter selection in lattice-based FHE schemes is a practical challenge for experts and non-experts alike. Towards a solution to this problem, we introduce a hybrid principles-based approach that combines theoretical and experimental analyses. To begin, we use regression analysis to examine the effect of the parameters on performance and security. The fact that the FHE parameters induce different behaviors in performance, security, and Ciphertext Expansion Factor (CEF) makes the process of parameter selection more challenging. To address this issue, we use a multi-objective optimization algorithm to select the optimum parameter set for performance, CEF, and security at the same time. As a result of this optimization, we obtain an improved parameter set that gives better performance at a given security level while ensuring correctness and providing at least 128-bit security against lattice attacks. Our result enables, on average, a ~5x smaller CEF and mostly better performance in comparison to the parameter sets given in [1]. This approach can be considered a semi-automated parameter selection. These studies are conducted using PALISADE, a well-known homomorphic encryption library.

Keywords: lattice cryptography, fully homomorphic encryption, parameter selection, LWE, RLWE

Procedia PDF Downloads 120
17211 Comparison of the Effectiveness of Tree Algorithms in Classification of Spongy Tissue Texture

Authors: Roza Dzierzak, Waldemar Wojcik, Piotr Kacejko

Abstract:

Analysis of the texture of medical images consists of determining the parameters and characteristics of the examined tissue. The main goal is to assign the analyzed area to one of two basic groups: healthy tissue or tissue with pathological changes. CT images of the thoracic-lumbar spine from 15 healthy patients and 15 patients with confirmed osteoporosis were used for the analysis. As a result, 120 samples with dimensions of 50x50 pixels were obtained. The set of features was obtained based on the histogram, gradient, run-length matrix, co-occurrence matrix, autoregressive model, and Haar wavelet. As a result of the image analysis, 290 descriptors of textural features were obtained. The dimension of the feature space was reduced using three selection methods: Fisher coefficient (FC), mutual information (MI), and minimization of the classification error probability combined with the average correlation coefficients between the chosen features (POE + ACC). Each of them returned the ten top-ranked features according to its own coefficient. The Fisher coefficient and mutual information selections yielded the same features arranged in a different order. In both rankings, the 50% percentile (Perc.50%) was found in first place. The next selected features come from the co-occurrence matrix. The selected sets of features were evaluated using six classification tree methods: decision stump (DS), Hoeffding tree (HT), logistic model trees (LMT), random forest (RF), random tree (RT) and reduced error pruning tree (REPT).
In order to assess the accuracy of the classifiers, the following parameters were used: overall classification accuracy (ACC), true positive rate (TPR, classification sensitivity), true negative rate (TNR, classification specificity), positive predictive value (PPV) and negative predictive value (NPV). Taking into account the classification results, the best results were obtained for the Hoeffding tree and logistic model trees classifiers, using the set of features selected by the POE + ACC method. In the case of the Hoeffding tree classifier, the highest values of three parameters were obtained: ACC = 90%, TPR = 93.3% and PPV = 93.3%. Additionally, the values of the other two parameters, i.e., TNR = 86.7% and NPV = 86.6%, were close to the maximum values obtained for the LMT classifier. In the case of the logistic model trees classifier, the same accuracy was obtained (ACC = 90%), along with the highest values of TNR = 88.3% and NPV = 88.3%. The values of the other two parameters remained close to the highest: TPR = 91.7% and PPV = 91.6%. The results obtained in the experiment show that the use of classification trees is an effective method for the classification of texture features. This allows the condition of the spongy tissue to be identified for healthy cases and those with osteoporosis.
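
The five evaluation parameters listed above all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name is hypothetical, and any counts passed to it are the caller's own, not the counts from this study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp: pathological samples classified as pathological
    fn: pathological samples classified as healthy
    tn: healthy samples classified as healthy
    fp: healthy samples classified as pathological
    """
    total = tp + fn + tn + fp
    return {
        "ACC": (tp + tn) / total,  # overall classification accuracy
        "TPR": tp / (tp + fn),     # sensitivity
        "TNR": tn / (tn + fp),     # specificity
        "PPV": tp / (tp + fp),     # positive predictive value
        "NPV": tn / (tn + fn),     # negative predictive value
    }
```

TPR and TNR are computed per true class, whereas PPV and NPV are computed per predicted class, so the two pairs generally differ unless the errors are balanced.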

Keywords: classification, feature selection, texture analysis, tree algorithms

Procedia PDF Downloads 144
17210 Manufacturing Facility Location Selection: A Numerical Taxonomy Approach

Authors: Seifoddini Hamid, Mardikoraeem Mahsa, Ghorayshi Roya

Abstract:

Manufacturing facility location selection is an important strategic decision for many industrial corporations. In this paper, a new approach to the manufacturing location selection problem is proposed. In this approach, cluster analysis is employed to identify suitable manufacturing locations based on economic, social, environmental, and political factors. These factors are quantified using the existing real world data.

Keywords: manufacturing facility, manufacturing sites, real world data

Procedia PDF Downloads 538
17209 Optimizing Design Parameters for Efficient Saturated Steam Production in Fire Tube Boilers: A Cost-Effective Approach

Authors: Yoftahe Nigussie Worku

Abstract:

This research focuses on advancing fire tube boiler technology by systematically optimizing design parameters to achieve efficient saturated steam production. The main objective is to design a high-performance boiler with a production capacity of 2000kg/h at a 12-bar design pressure while minimizing costs. The methodology employs iterative analysis, utilizing relevant formulas, and considers material selection and production methods. The study successfully results in a boiler operating at 85.25% efficiency, with a fuel consumption rate of 140.37kg/hr and a heat output of 1610kW. Theoretical importance lies in balancing efficiency, safety considerations, and cost minimization. The research addresses key questions on parameter optimization, material choices, and safety-efficiency balance, contributing valuable insights to fire tube boiler design.

Keywords: safety consideration, efficiency, production methods, material selection

Procedia PDF Downloads 30
17208 Research on Urban Point of Interest Generalization Method Based on Mapping Presentation

Authors: Chengming Li, Yong Yin, Peipei Guo, Xiaoli Liu

Abstract:

Existing point generalization algorithms focus merely on the overall information of point groups and do not take into account the attribute richness of POI (point of interest) data or its spatial distribution, which is constrained by roads. To address this, a POI generalization method that considers both attribute information and spatial distribution is proposed. The hierarchical characteristics of urban POI information expression are first analyzed to identify the measurement features of each hierarchy level. On this basis, an urban POI generalization strategy is put forward: POIs in the urban road network are divided into three distribution patterns, and corresponding generalization methods are proposed according to the characteristics of POI data in each pattern. Experimental results show that a method taking into account both the attribute information and the spatial distribution characteristics of POIs can better implement urban POI generalization in mapping presentation.

Keywords: POI, road network, selection method, spatial information expression, distribution pattern

Procedia PDF Downloads 381
17207 Proposal of a Model Supporting Decision-Making on Information Security Risk Treatment

Authors: Ritsuko Kawasaki, Takeshi Hiromatsu

Abstract:

Management is required to understand all information security risks within an organization and to decide which risks should be treated, to what level, and at what cost. However, such decision-making is usually not easy, because various risk treatment measures must be selected together with suitable application levels. In addition, some measures may have conflicting objectives, which also makes the selection difficult. Therefore, this paper provides a model that supports the selection of measures by applying multi-objective analysis to find an optimal solution. A list of measures is also provided to make the selection easier and more effective and to ensure that no measure is overlooked.

Keywords: information security risk treatment, selection of risk measures, risk acceptance, multi-objective optimization

Procedia PDF Downloads 350
17206 Methodology for the Selection of Chemical Textile Products

Authors: Oscar F. Toro, Alexia Pardo Figueroa, Brigitte M. Larico

Abstract:

The development of new processes in the textile industry entails designing methodologies to select adequate supplies that fit the requirements of these new processes. This paper presents a methodology to select chemicals that fulfill the technical specifications of a new process. The proposed methodology involves three major phases: (1) data collection on chemical products, (2) qualitative pre-selection and (3) laboratory tests. We have applied this methodology to the selection of a binder that forms a protective film over the textile fibers and bonds them. We found five possible products that can be used in our new process: Arkofil, Elvanol, Size plus A, Size plus AC and Starch. This new methodology has both qualitative and experimental variables and can be used to select supplies for new textile processes.

Keywords: binder, chemical products, selection methodology, textile supplies, textile fiber

Procedia PDF Downloads 265