Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 5795

Search results for: features selection for CBIR

5705 Multi-Class Text Classification Using Ensembles of Classifiers

Authors: Syed Basit Ali Shah Bukhari, Yan Qiang, Saad Abdul Rauf, Syed Saqlaina Bukhari

Abstract:

Text classification is the task of assigning a given text to the appropriate category from a given set of categories. Achieving this requires a proper set of pre-processing, feature selection, and classification techniques. In this paper, we use different ensemble techniques, along with varying feature selection parameters, to observe the change in the overall accuracy of the result and in individual class-based measures, including the precision for each category of text. After subjecting our data to pre-processing and feature selection, individual classifiers were tested first, and the classifiers were then combined into ensembles to increase their accuracy. We also studied the impact of decreasing the number of classification categories on the overall accuracy. Text classification is widely used in sentiment analysis on social media sites such as Twitter to gauge people’s opinions about a cause, and to analyze customers’ reviews of products or services. Opinion mining is a vital task in data mining, and text categorization is the backbone of opinion mining.
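The abstract above does not say how the individual classifiers were combined beyond naming bagging and AdaBoost in the keywords. As a minimal sketch of the general idea of an ensemble, assuming simple hard (majority) voting and hypothetical category labels, the combination step could look like this:

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per document.

    `predictions` holds one list of labels per classifier; all lists
    are assumed to cover the same documents in the same order.
    """
    combined = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count;
        # on a tie, the label that appeared first wins.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers on four documents.
clf_a = ["sports", "politics", "tech", "tech"]
clf_b = ["sports", "tech", "tech", "politics"]
clf_c = ["politics", "politics", "tech", "tech"]

ensemble = majority_vote([clf_a, clf_b, clf_c])
```

Each document receives the label chosen by most classifiers, so a single classifier's mistake is outvoted whenever the other two agree.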

Keywords: Natural Language Processing, Ensemble Classifier, Bagging Classifier, AdaBoost

Procedia PDF Downloads 208

5704 A Review of Effective Gene Selection Methods for Cancer Classification Using Microarray Gene Expression Profile

Authors: Hala Alshamlan, Ghada Badr, Yousef Alohali

Abstract:

Cancer is one of the most dreaded diseases and causes a considerable death rate in humans. DNA microarray-based gene expression profiling has emerged as an efficient technique for cancer classification, as well as for diagnosis, prognosis, and treatment purposes. In recent years, DNA microarray techniques have attracted growing attention in both scientific and industrial fields. It is important to determine the informative genes that cause cancer in order to improve early cancer diagnosis and to give effective chemotherapy treatment. To gain deep insight into the cancer classification problem, it is necessary to take a closer look at the proposed gene selection methods. We believe that they should be an integral preprocessing step for cancer classification. Furthermore, finding an accurate gene selection method is a significant issue in cancer classification because it reduces the dimensionality of the microarray dataset and selects informative genes. In this paper, we classify and review the state-of-the-art gene selection methods. We proceed by evaluating the performance of each gene selection approach based on its classification accuracy and number of informative genes. In our evaluation, we use four benchmark microarray datasets for cancer diagnosis (leukemia, colon, lung, and prostate). In addition, we compare the performance of the gene selection methods to identify the one that can select a small set of marker genes while ensuring high cancer classification accuracy. To the best of our knowledge, this is the first attempt to compare gene selection approaches for cancer classification using microarray gene expression profiles.

Keywords: gene selection, feature selection, cancer classification, microarray, gene expression profile

Procedia PDF Downloads 427

5703 Random Forest Classification for Population Segmentation

Authors: Regina Chua

Abstract:

To reduce the costs of re-fielding a large survey, a Random Forest classifier was applied to measure the accuracy of classifying individuals into their assigned segments with the fewest possible questions. Given a long survey, one needed to determine the most predictive ten or fewer questions that would accurately assign new individuals to custom segments. Furthermore, the solution needed to be quick in its classification and usable in non-Python environments. In this paper, a supervised Random Forest classifier was modeled on a dataset with 7,000 individuals, 60 questions, and 254 features. The Random Forest consisted of an iterative collection of individual decision trees that result in a predicted segment with robust precision and recall scores compared to a single tree. A random 70-30 stratified sampling for training the algorithm was used, and accuracy trade-offs at different depths for each segment were identified. Ultimately, the Random Forest classifier performed at 87% accuracy at a depth of 10 with 20 instead of 254 features and 10 instead of 60 questions. With an acceptable accuracy in prioritizing feature selection, new tools were developed for non-Python environments: a worksheet with a formulaic version of the algorithm and an embedded function to predict the segment of an individual in real-time. Random Forest was determined to be an optimal classification model by its feature selection, performance, processing speed, and flexible application in other environments.

Keywords: machine learning, supervised learning, data science, random forest, classification, prediction, predictive modeling

Procedia PDF Downloads 73

5702 An Improved Convolution Deep Learning Model for Predicting Trip Mode Scheduling

Authors: Amin Nezarat, Naeime Seifadini

Abstract:

Trip mode selection is a behavioral characteristic of passengers with immense importance for travel demand analysis, transportation planning, and traffic management. Identification of trip mode distribution will allow transportation authorities to adopt appropriate strategies to reduce travel time, traffic, and air pollution. The majority of existing trip mode inference models operate based on human-selected features and traditional machine learning algorithms. However, human-selected features are sensitive to changes in traffic and environmental conditions and susceptible to personal biases, which can make them inefficient. One way to overcome these problems is to use neural networks capable of extracting high-level features from raw input. In this study, the convolutional neural network (CNN) architecture is used to predict the trip mode distribution based on raw GPS trajectory data. The key innovation of this paper is the design of the layout of the input layer of the CNN, as well as the normalization operation, in a way that is not only compatible with the CNN architecture but can also represent the fundamental features of motion, including speed, acceleration, jerk, and bearing rate. The highest prediction accuracy achieved with the proposed configuration for the convolutional neural network with batch normalization is 85.26%.
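The abstract above names speed, acceleration, jerk, and bearing rate as the motion features derived from raw GPS trajectories, but does not give formulas. The following is a minimal sketch under stated assumptions: points are hypothetical `(time_s, lat_deg, lon_deg)` tuples, distances use the haversine formula on a spherical Earth, and bearing rate is taken as the absolute change in heading per second, which is one plausible definition and not necessarily the authors'.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial heading from point 1 to point 2, in degrees within [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def motion_features(points):
    """Per-segment motion features from (time_s, lat, lon) tuples.

    Timestamps are assumed to be strictly increasing.
    """
    pairs = list(zip(points, points[1:]))
    dts = [b[0] - a[0] for a, b in pairs]
    speed = [haversine_m(a[1], a[2], b[1], b[2]) / dt for (a, b), dt in zip(pairs, dts)]
    bearing = [bearing_deg(a[1], a[2], b[1], b[2]) for a, b in pairs]
    # Finite differences: each derived series is one element shorter.
    accel = [(v2 - v1) / dt for v1, v2, dt in zip(speed, speed[1:], dts[1:])]
    jerk = [(a2 - a1) / dt for a1, a2, dt in zip(accel, accel[1:], dts[2:])]
    # Wrap heading changes into [-180, 180) before taking the magnitude.
    bearing_rate = [
        abs((b2 - b1 + 180.0) % 360.0 - 180.0) / dt
        for b1, b2, dt in zip(bearing, bearing[1:], dts[1:])
    ]
    return {"speed": speed, "acceleration": accel, "jerk": jerk, "bearing_rate": bearing_rate}


# Hypothetical trajectory: due north at a constant pace, one fix per second.
track = [(0, 0.0, 0.0), (1, 0.0001, 0.0), (2, 0.0002, 0.0), (3, 0.0003, 0.0)]
features = motion_features(track)
```

For this straight, constant-speed track the speed is roughly 11 m/s on every segment, while acceleration, jerk, and bearing rate are effectively zero.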

Keywords: predicting, deep learning, neural network, urban trip

Procedia PDF Downloads 107

5701 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled when the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection, which finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 538

5700 Credit Card Fraud Detection with Ensemble Model: A Meta-Heuristic Approach

Authors: Gong Zhilin, Jing Yang, Jian Yin

Abstract:

The purpose of this paper is to develop a novel system for credit card fraud detection based on sequential modeling of data using hybrid deep learning models. The proposed model comprises five major phases: pre-processing, imbalanced-data handling, feature extraction, optimal feature selection, and fraud detection with an ensemble classifier. The collected raw data (input) is pre-processed to enhance the quality of the data by alleviating missing data, noisy data, and null values. The pre-processed data are class-imbalanced in nature and are therefore handled with the K-means clustering-based SMOTE model. From the class-balanced data, the most relevant features are extracted: improved Principal Component Analysis (PCA) features, statistical features (mean, median, standard deviation), and higher-order statistical features (skewness and kurtosis). Among the extracted features, the optimal features are selected with the Self-improved Arithmetic Optimization Algorithm (SI-AOA), a conceptual improvement of the standard Arithmetic Optimization Algorithm. The ensemble classifier comprises deep learning models: Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and an optimized Quantum Deep Neural Network (QDNN). The LSTM and CNN are trained with the selected optimal features. The outcomes from the LSTM and CNN are fed as input to the optimized QDNN, which provides the final detection outcome. Since the QDNN is the ultimate detector, its weight function is fine-tuned with SI-AOA.
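The statistical and higher-order statistical features named in the abstract above (mean, median, standard deviation, skewness, kurtosis) can be illustrated with a short sketch. It assumes population (not sample) moments and non-excess kurtosis, since the abstract does not say which convention is used, and the input window of transaction amounts is hypothetical.

```python
import statistics


def statistical_features(values):
    """Mean, median, standard deviation, skewness, and kurtosis of a window.

    Skewness and kurtosis are the standardized third and fourth central
    moments (population form); they are undefined for constant input.
    """
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std": std,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / std ** 4,  # a normal distribution gives 3.0
    }


# Hypothetical window of transaction amounts.
window_features = statistical_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

For the symmetric window shown, the skewness is zero and the kurtosis is 1.7, which is flatter than the value of 3 expected for normally distributed data.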

Keywords: credit card, data mining, fraud detection, money transactions

Procedia PDF Downloads 104

5699 Selection of Solid Waste Landfill Site Using Geographical Information System (GIS)

Authors: Fatih Iscan, Ceren Yagci

Abstract:

Rapid population growth, urbanization, and industrialization are known as the most important causes of environmental problems. The disposal and management of solid wastes are also among the most important environmental problems. One of the main problems in solid waste management is selecting the best site for solid waste disposal. Lately, Geographical Information Systems (GIS) have been used to ease the selection of landfill areas. GIS can model the necessary economic, environmental, and political constraints, and it plays an important role in landfill site selection as a decision support tool. In this study, map layers are examined so as to minimize the effect of environmental, social, and cultural factors and maximize the effect of engineering and economic factors in landfill site selection, and the use of GIS as a decision support mechanism for solid waste landfill site selection is presented for Güzelyurt district, Aksaray, Turkey.

Keywords: GIS, landfill, solid waste, spatial analysis

Procedia PDF Downloads 333

5698 A Two-Stage Bayesian Variable Selection Method with the Extension of Lasso for Geo-Referenced Data

Authors: Georgiana Onicescu, Yuqian Shen

Abstract:

Due to the complex nature of geo-referenced data, multicollinearity of the risk factors is a commonly encountered issue in public health spatial studies; it leads to low parameter estimation accuracy because it inflates the variance in the regression analysis. To address this issue, we propose a two-stage variable selection method that extends the least absolute shrinkage and selection operator (Lasso) to the Bayesian spatial setting in order to investigate the impact of risk factors on health outcomes. Specifically, in stage I, we perform variable selection using Bayesian Lasso and several other variable selection approaches. Then, in stage II, we perform model selection with only the variables selected in stage I and compare the methods again. To evaluate the performance of the two-stage variable selection methods, we conducted a simulation study with different distributions for the risk factors, using geo-referenced count data as the outcome and Michigan as the research region. We considered the cases in which all candidate risk factors are independently normally distributed or follow a multivariate normal distribution with different correlation levels. Two other Bayesian variable selection methods, the binary indicator and the combination of the binary indicator and Lasso, were considered and compared as alternatives. The simulation results indicate that the proposed two-stage Bayesian Lasso variable selection method performs best in both the independent and dependent cases considered. Compared with the one-stage approach and the two alternative methods, the two-stage Bayesian Lasso approach provides the highest estimation accuracy in all scenarios considered.

Keywords: Lasso, Bayesian analysis, spatial analysis, variable selection

Procedia PDF Downloads 111

5697 Active Features Determination: A Unified Framework

Authors: Meenal Badki

Abstract:

We address the issue of active feature determination, where the objective is to determine the set of examples on which additional data (such as lab tests) needs to be gathered, given a large number of examples with some features (such as demographics) and some examples with all the features (such as the complete Electronic Health Record). We note that certain features may be more costly, unique, or laborious to gather. Our proposal is a general active learning approach that is independent of classifiers and similarity metrics. It allows us to identify examples that differ from the full data set and obtain all the features for the examples that match. Our comprehensive evaluation shows the efficacy of this approach, which is driven by four authentic clinical tasks.

Keywords: feature determination, classification, active learning, sample-efficiency

Procedia PDF Downloads 39

5696 Predicting Stack Overflow Accepted Answers Using Features and Models with Varying Degrees of Complexity

Authors: Osayande Pascal Omondiagbe, Sherlock A. Licorish

Abstract:

Stack Overflow is a popular community question and answer portal which is used by practitioners to solve technology-related challenges during software development. Previous studies have shown that this forum is becoming a substitute for official software programming languages documentation. While tools have looked to aid developers by presenting interfaces to explore Stack Overflow, developers often face challenges searching through many possible answers to their questions, and this extends the development time. To this end, researchers have provided ways of predicting acceptable Stack Overflow answers by using various modeling techniques. However, less interest is dedicated to examining the performance and quality of typically used modeling methods, and especially in relation to models’ and features’ complexity. Such insights could be of practical significance to the many practitioners that use Stack Overflow. This study examines the performance and quality of various modeling methods that are used for predicting acceptable answers on Stack Overflow, drawn from 2014, 2015 and 2016. Our findings reveal significant differences in models’ performance and quality given the type of features and complexity of models used. Researchers examining classifiers’ performance and quality and features’ complexity may leverage these findings in selecting suitable techniques when developing prediction models.

Keywords: feature selection, modeling and prediction, neural network, random forest, stack overflow

Procedia PDF Downloads 112

5695 2D Point Clouds Features from Radar for Helicopter Classification

Authors: Danilo Habermann, Aleksander Medella, Carla Cremon, Yusef Caceres

Abstract:

This paper aims to analyze the ability of 2D point cloud features to classify different models of helicopters using radars. This method does not need to estimate the blade length, the number of blades, or the period of the helicopters' micro-Doppler signatures. It is also not necessary to generate spectrograms (or any other image based on the time and frequency domains). This work transforms a radar return signal into a 2D point cloud and extracts features from it. Three classifiers are used to distinguish 9 different helicopter models in order to analyze the performance of the features used in this work. The high accuracy obtained with each of the classifiers demonstrates that the 2D point cloud features are very useful for classifying helicopters from radar signals.

Keywords: helicopter classification, point clouds features, radar, supervised classifiers

Procedia PDF Downloads 190

5694 Dimensionality Reduction in Modal Analysis for Structural Health Monitoring

Authors: Elia Favarelli, Enrico Testi, Andrea Giorgetti

Abstract:

Autonomous structural health monitoring (SHM) of many structures and bridges has become a topic of paramount importance for maintenance purposes and safety reasons. This paper proposes a set of machine learning (ML) tools to perform automatic feature selection and detection of anomalies in a bridge from vibrational data, and compares different feature extraction schemes to increase the accuracy and reduce the amount of data collected. As a case study, the Z-24 bridge is considered because of the extensive database of accelerometric data in both standard and damaged conditions. The proposed framework starts from the first four fundamental frequencies extracted through operational modal analysis (OMA) and clustering, followed by density-based time-domain filtering (tracking). The extracted fundamental frequencies are then fed to a dimensionality reduction block implemented through two different approaches: feature selection (intelligent multiplexer), which tries to estimate the most reliable frequencies based on the evaluation of some statistical features (i.e., mean value, variance, kurtosis), and feature extraction (auto-associative neural network (ANN)), which combines the fundamental frequencies to extract new damage-sensitive features in a low-dimensional feature space. Finally, one-class classifier (OCC) algorithms perform anomaly detection; they are trained with standard-condition points and tested with normal and anomalous ones. In particular, a new anomaly detection strategy is proposed, namely one class classifier neural network two (OCCNN2), which exploits the classification capability of standard classifiers in an anomaly detection problem, finding the standard class (the boundary of the feature space in normal operating conditions) through a two-step approach: coarse and fine boundary estimation. The coarse estimation uses classic OCC techniques, while the fine estimation is performed through a trained feedforward neural network (NN) that exploits the boundaries estimated in the coarse step. The detection algorithms are then compared with known methods based on principal component analysis (PCA), kernel principal component analysis (KPCA), and auto-associative neural network (ANN). In many cases, the proposed solution increases the performance with respect to the standard OCC algorithms in terms of F1 score and accuracy. In particular, by evaluating the correct features, the anomaly can be detected with accuracy and an F1 score greater than 96% with the proposed method.

Keywords: anomaly detection, frequencies selection, modal analysis, neural network, sensor network, structural health monitoring, vibration measurement

Procedia PDF Downloads 97

5693 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from face images is a challenging task. Several related methods have not utilized dynamic facial features effectively for high performance. This paper proposes a high-performance method for emotion recognition using dynamic facial features. Initially, local features are captured in each frame by Gabor filters with different scales and orientations in order to find the position and scale of the face against different backgrounds. The Gabor features are sent to an ensemble classifier for detecting Gabor facial features. The regions of dynamic features are captured from the Gabor facial features in consecutive frames and represent the dynamic variations of facial appearance. Each region of dynamic features is normalized using the Z-score normalization method and then encoded into binary pattern features with the help of threshold values. The binary features are passed to a multi-class AdaBoost classifier, trained on a database containing happiness, sadness, surprise, fear, anger, disgust, and neutral expressions, to classify the discriminative dynamic features for emotion recognition. The developed method is evaluated on the Ryerson Multimedia Research Lab and Cohn-Kanade databases and shows significant performance improvement over existing methods owing to its dynamic features.
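The normalization and encoding step described in the abstract above can be sketched as follows. This is a minimal illustration, assuming each region is a flat list of hypothetical feature responses and a single threshold of 0.0 on the normalized values; the abstract does not state the actual threshold values or the pattern layout.

```python
import statistics


def zscore(values):
    """Z-score normalization: subtract the mean, divide by the std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # A constant region carries no variation; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def binarize(values, threshold=0.0):
    """Encode values as bits: 1 if above the threshold, else 0.

    The default threshold of 0.0 is an assumption for illustration.
    """
    return [1 if v > threshold else 0 for v in values]


# Hypothetical feature responses from one region of dynamic features.
region = [10.0, 12.0, 14.0, 16.0, 18.0]
normalized = zscore(region)
pattern = binarize(normalized)
```

After normalization the values have zero mean and unit variance, so a fixed threshold is comparable across regions and frames regardless of the original response scale.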

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 250

5692 A Theoretical Framework for Conceptualizing Integration of Environmental Sustainability into Supplier Selection

Authors: Tonny Ograh, Joshua Ayarkwa, Dickson Osei-Asibey, Alex Acheampong, Peter Amoah

Abstract:

Theories are used to improve the conceptualization of research ideas. They provide valuable explanations that help us grasp the meaning of research findings. Nevertheless, the use of theories to promote studies of green supplier selection in procurement decisions has attracted little attention. With the emergence of sustainable procurement, public procurement practitioners in Ghana have yet to acquire the relevant knowledge of green supplier selection, owing to insufficient knowledge and a lack of appropriate frameworks. The flagrancy of the consequences of public procurers’ failure to integrate environmental considerations into supplier selection explains the adoption of a multi-theory approach for comprehending the dynamics of green integration into supplier selection. In this paper, the practicality of three theories for improving the understanding of the influential factors enhancing the integration of environmental sustainability into supplier selection is reviewed. The three theories are Resource-Based Theory, Human Capital Theory, and Absorptive Capacity Theory. This review uncovered knowledge management, top management commitment, and environmental management capabilities as important elements needed for the integration of environmental sustainability into supplier selection in public procurement. The theoretical review yielded a framework that conceptualizes the knowledge and capabilities of practitioners relevant to the incorporation of environmental sustainability into supplier selection in public procurement.

Keywords: environmental, sustainability, supplier selection, environmental procurement, sustainable procurement

Procedia PDF Downloads 151

5691 Advanced Technologies and Algorithms for Efficient Portfolio Selection

Authors: Konstantinos Liagkouras, Konstantinos Metaxiotis

Abstract:

In this paper, we present a classification of the various technologies applied to the portfolio selection problem according to the discipline and the methodological framework followed. We provide a concise presentation of the resulting categories and try to identify which methods are considered obsolete and which lie at the heart of the debate. In addition, we provide a comparative study of the different technologies applied for efficient portfolio construction, and we suggest potential paths for future work that lie at the intersection of the presented techniques.

Keywords: portfolio selection, optimization techniques, financial models, stochastic, heuristics

Procedia PDF Downloads 405

5690 New Features for Copy-Move Image Forgery Detection

Authors: Michael Zimba

Abstract:

A novel set of features for copy-move image forgery (CMIF) detection is proposed. The proposed set presents a new approach that relies on electrostatic field theory (EFT). Solely for the purpose of reducing the dimension of a suspicious image, the method first performs a discrete wavelet transform (DWT) of the image and extracts only the approximation subband. The extracted subband is then bijectively mapped onto a virtual electrostatic field, where concepts of EFT are utilised to extract robust features. The extracted features are shown to be invariant to additive noise, JPEG compression, and affine transformation. The proposed features can also be used in general object matching.

Keywords: virtual electrostatic field, features, affine transformation, copy-move image forgery

Procedia PDF Downloads 521

5689 Enhancing the Recruitment Process through Machine Learning: An Automated CV Screening System

Authors: Kaoutar Ben Azzou, Hanaa Talei

Abstract:

Human resources is an important department in every organization, as it manages the life cycle of employees from recruitment and training to retirement or termination of contracts. The recruitment process starts with a job opening, followed by the selection of the best-fit candidates from all applicants. Matching the best profile to a job position requires manually reviewing many CVs, which takes hours of work and can sometimes lead to a suboptimal choice. The work presented in this paper aims at reducing the workload of HR personnel by automating the preliminary stages of the candidate screening process, thereby fostering a more streamlined recruitment workflow. It introduces an automated system designed to help with the recruitment process by scanning candidates' CVs, extracting pertinent features, and employing machine learning algorithms to decide the most fitting job profile for each candidate. Our work employs natural language processing (NLP) techniques to identify and extract key features, such as education, work experience, and skills, from the unstructured text of a CV. Subsequently, the system uses these features to match candidates with job profiles by means of classification algorithms.

Keywords: automated recruitment, candidate screening, machine learning, human resources management

Procedia PDF Downloads 29

5688 The Discussion on the Composition of Feng Shui by the Environmental Planning Viewpoint

Authors: Jhuang Jin-Jhong, Hsieh Wei-Fan

Abstract:

Climate change persistently causes natural disasters. Therefore, the objectives of environmental planning nowadays tend toward respecting and coexisting with nature. As a result, natural environment analysis, e.g., the analysis of topography, soil, hydrology, climate, and vegetation, is highly emphasized. On the other hand, Feng Shui has been a criterion of site selection for residences in the East since ancient times and has further influenced site selection for castles and even for temples and tombs. The primary criterion of site selection is judging the quality of Long (mountain range), Sha (nearby mountains), Shui (hydrology), Xue (foundation), and Xiang (aspect), which are similar to the environmental variables of mountain range, topography, hydrology, and aspect. For this reason, many researchers have attempted to probe the connection between the criteria of Feng Shui and environmental planning factors. Most studies have discussed only the composition and spatial theory of Feng Shui, and no research has explained Feng Shui from the perspective of the environmental field. Consequently, this study reviews the theory of Feng Shui from the environmental planning viewpoint and assembles the essential composition factors of Feng Shui. From the literature review and a comparison of theoretical meanings, we find that the ideal principles for planning the Feng Shui environment can also be used for environmental planning. Therefore, this article contrasts 12 ideal environmental features used in Feng Shui with the natural aspects of the environment, makes comparisons with previous research, and classifies the environmental factors into climate, topography, hydrology, vegetation, and soil.

Keywords: the composition of Feng Shui, environmental planning, site selection, main components of the Feng Shui environment

Procedia PDF Downloads 486

5687 Parameter Selection for Computationally Efficient Use of the BFVrns Fully Homomorphic Encryption Scheme

Authors: Cavidan Yakupoglu, Kurt Rohloff

Abstract:

In this study, we aim to provide a novel parameter selection model for the BFVrns scheme, which is one of the prominent FHE schemes. Parameter selection in lattice-based FHE schemes is a practical challenge for experts and non-experts alike. Towards a solution to this problem, we introduce a hybrid principles-based approach that combines theoretical with experimental analyses. To begin, we use regression analysis to examine the effect of the parameters on performance and security. The fact that the FHE parameters induce different behaviors in performance, security, and Ciphertext Expansion Factor (CEF) makes the process of parameter selection more challenging. To address this issue, we use a multi-objective optimization algorithm to select the optimum parameter set for performance, CEF, and security at the same time. As a result of this optimization, we obtain an improved parameter set that gives better performance at a given security level while ensuring correctness and security against lattice attacks by providing at least 128-bit security. Our result enables an average ~5x smaller CEF and mostly better performance in comparison to the parameter sets given in [1]. This approach can be considered semi-automated parameter selection. These studies are conducted using the PALISADE homomorphic encryption library, which is a well-known HE library.

Keywords: lattice cryptography, fully homomorphic encryption, parameter selection, LWE, RLWE

Procedia PDF Downloads 123

5686 Evaluation and Selection of Construction Contractors by Polish Public Clients

Authors: Kozik Renata, Leśniak Agnieszka, Plebankiewicz Edyta

Abstract:

Contracting authorities in the public sector are obligated to apply the principles provided for in Polish law for the evaluation and selection of contractors. To analyze the contractor selection methods applied in practice by public clients, the notices of contract award results for construction works were analyzed. The analysis shows that the procedures selected more and more often are open competitive bidding, in which the assessment of the competence of contractors is not very precise, and non-competitive bidding, i.e., single-source procurement. The share of procurement procedures in which the only criterion is price is increasing. The solution to these problems might be the introduction of one of the forms of pre-selection of contractors. The article also briefly discusses the verification systems used in EU countries for companies applying for public contracts.

Keywords: certification, contractors selection, open tendering, public investors

Procedia PDF Downloads 260

5685 Manufacturing Facility Location Selection: A Numerical Taxonomy Approach

Authors: Seifoddini Hamid, Mardikoraeem Mahsa, Ghorayshi Roya

Abstract:

Manufacturing facility location selection is an important strategic decision for many industrial corporations. In this paper, a new approach to the manufacturing location selection problem is proposed. In this approach, cluster analysis is employed to identify suitable manufacturing locations based on economic, social, environmental, and political factors. These factors are quantified using the existing real world data.

Keywords: manufacturing facility, manufacturing sites, real world data

Procedia PDF Downloads 541

5684 Proposal of a Model Supporting Decision-Making on Information Security Risk Treatment

Authors: Ritsuko Kawasaki, Takeshi Hiromatsu

Abstract:

Management is required to understand all information security risks within an organization and to decide which risks should be treated, to what level, and at what cost. However, such decision-making is usually not easy, because various risk treatment measures must be selected together with suitable application levels. In addition, some measures may have conflicting objectives, which makes the selection even more difficult. Therefore, this paper provides a model that supports the selection of measures by applying multi-objective analysis to find an optimal solution. Additionally, a list of measures is provided to make the selection easier and more effective and to ensure that no measure is overlooked.
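As a rough illustration of the multi-objective idea, the sketch below filters a set of candidate measures down to those that are not dominated on two objectives (cost and residual risk, both minimized). The abstract does not describe the model itself, so the two objectives, the `Measure` type, and all names and numbers are hypothetical and serve only as an example of Pareto filtering.

```python
from typing import List, NamedTuple


class Measure(NamedTuple):
    name: str             # hypothetical measure name
    cost: float           # hypothetical cost units (lower is better)
    residual_risk: float  # hypothetical risk left after treatment (lower is better)


def dominates(a: Measure, b: Measure) -> bool:
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    no_worse = a.cost <= b.cost and a.residual_risk <= b.residual_risk
    strictly_better = a.cost < b.cost or a.residual_risk < b.residual_risk
    return no_worse and strictly_better


def pareto_front(measures: List[Measure]) -> List[Measure]:
    """Keep only the measures that no other measure dominates."""
    return [m for m in measures if not any(dominates(o, m) for o in measures)]


# Illustrative candidates only; not data from the paper.
candidates = [
    Measure("encrypt-backups", 3.0, 0.40),
    Measure("staff-training", 1.0, 0.70),
    Measure("full-audit", 5.0, 0.35),
    Measure("legacy-firewall", 4.0, 0.60),
]
front = pareto_front(candidates)
front_names = sorted(m.name for m in front)
```

A decision-maker would still have to choose among the surviving measures, for example by weighting the objectives; the filter only removes options that are worse on every count.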

Keywords: information security risk treatment, selection of risk measures, risk acceptance, multi-objective optimization

Procedia PDF Downloads 352

5683 Methodology for the Selection of Chemical Textile Products

Authors: Oscar F. Toro, Alexia Pardo Figueroa, Brigitte M. Larico

Abstract:

The development of new processes in the textile industry entails designing methodologies to select adequate supplies that fit the requirements of these new processes. This paper presents a methodology to select chemicals that fulfill the technical specifications of a new process. The proposed methodology involves three major phases: (1) data collection on chemical products, (2) qualitative pre-selection, and (3) laboratory tests. We have applied this methodology to the selection of a binder that forms a protective film over the textile fibers and bonds them. We found five products that can be used in our new process: Arkofil, Elvanol, Size plus A, Size plus AC and Starch. This new methodology has both qualitative and experimental variables and can be used to select supplies for new textile processes.

Keywords: binder, chemical products, selection methodology, textile supplies, textile fiber

Procedia PDF Downloads 267

5682 Positive Bias and Length Bias in Deep Neural Networks for Premises Selection

Authors: Jiaqi Huang, Yuheng Wang

Abstract:

Premises selection, the task of selecting a set of axioms for proving a given conjecture, is a major bottleneck in automated theorem proving. An array of deep-learning-based methods has been established for premises selection, but achieving perfect performance remains challenging. Our study examines the inaccuracy of deep neural networks in premises selection. By training network models on encoded conjecture and axiom pairs from the Mizar Mathematical Library, two potential biases are found: the network models classify more premises as necessary than unnecessary, referred to as the ‘positive bias’, and the network models perform better in proving conjectures that are paired with more axioms, referred to as the ‘length bias’. The ‘positive bias’ and ‘length bias’ discovered could shed light on the limitations of existing deep neural networks.

Keywords: automated theorem proving, premises selection, deep learning, interpreting deep learning

Procedia PDF Downloads 153

5681 A Survey of Feature Selection and Feature Extraction Techniques in Machine Learning

Authors: Samina Khalid, Shamila Nasreen

Abstract:

Dimensionality reduction as a preprocessing step to machine learning is effective in removing irrelevant and redundant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase in the dimensionality of data poses a severe challenge to many existing feature selection and feature extraction methods with respect to efficiency and effectiveness. In the field of machine learning and pattern recognition, dimensionality reduction is an important area in which many approaches have been proposed. In this paper, some widely used feature selection and feature extraction techniques are analyzed to determine how effectively they can be used to achieve high performance of learning algorithms and ultimately improve the predictive accuracy of a classifier. A brief analysis of dimensionality reduction techniques is also presented to investigate the strengths and weaknesses of some widely used methods.

Keywords: age related macular degeneration, feature selection, feature subset selection, feature extraction/transformation, FSA’s, relief, correlation based method, PCA, ICA

Procedia PDF Downloads 461

5680 Partner Selection for Innovation Projects Related to New Product Concept Design

Authors: Odd Jarl Borch, Marina Z. Solesvik

Abstract:

The paper analyses partner selection approaches related to large-scale R&D-based innovation projects at different stages of development. We emphasize innovation projects in the maritime value chain and how partners are selected to improve quality according to high-spec customer demands and to reduce investment costs for new production technology such as advanced offshore service vessels. We elaborate on the differences in innovation approach and especially on how purposive inflows and outflows of knowledge from external partners may be used to accelerate internal innovation. We present three cases related to projects that differ in specificity and scope. We explore how the partner selection criteria change over time as the goals move from a wide scope to very specific R&D tasks.

Keywords: partner selection, innovation, offshore industry, concept design

Procedia PDF Downloads 492

5679 Multi-Criteria Test Case Selection Using Ant Colony Optimization

Authors: Niranjana Devi N.

Abstract:

Test case selection aims to select the subset of only the fit test cases and to remove the unfit, ambiguous, redundant, and unnecessary ones, which in turn improves the quality and reduces the cost of software testing. Test case optimization is the problem of finding the best subset of test cases from a pool of test cases to be audited, such that all the objectives of testing are met concurrently. However, most research has evaluated the fitness of test cases on a single parameter, fault-detecting capability, and has optimized the test cases using a single objective. In the proposed approach, nine parameters are considered for test case selection, and the best subset of parameters is obtained using an Interval Type-2 Fuzzy Rough Set. Test case selection is done in two stages. The first stage is a fuzzy entropy-based filtration technique, used for estimating and reducing the ambiguity in test case fitness evaluation and selection. The second stage is an ant colony optimization-based wrapper technique with a forward search strategy, employed to select test cases from the reduced test suite of the first stage. The results are evaluated using the Coverage parameters, Precision, Recall, F-Measure, APSC, APDC, and SSR. The experimental evaluation demonstrates that considerable computational effort can be avoided with this approach.

Keywords: ant colony optimization, fuzzy entropy, interval type-2 fuzzy rough set, test case selection

Procedia PDF Downloads 635

5678 Detection and Classification of Mammogram Images Using Principal Component Analysis and Lazy Classifiers

Authors: Rajkumar Kolangarakandy

Abstract:

Feature extraction and selection is the primary part of any mammogram classification algorithm. The choice of features, attributes, or measurements has an important influence on any classification system. Discrete Wavelet Transformation (DWT) coefficients are among the prominent features for representing images in the frequency domain. The features obtained after decomposing the mammogram images using wavelet transformations have a high dimension, yet they are highly correlated and redundant in nature. Dimensionality reduction techniques therefore play an important role in selecting the optimum number of features from such high-dimensional, highly correlated data. PCA is a mathematical tool that reduces the dimensionality of the data while retaining most of the variation in the dataset. In this paper, a multilevel classification of mammogram images using reduced discrete wavelet transformation coefficients and lazy classifiers is proposed. The classification is accomplished in two different levels. In the first level, mammogram ROIs extracted from the dataset are classified as normal or abnormal. In the second level, all the abnormal mammogram ROIs are classified as benign or malignant. A further classification is also accomplished based on the variation in structure and intensity distribution of the images in the dataset. The lazy classifiers Kstar, IBL, and LWL are used for classification. The classification results obtained with the reduced feature set are highly promising and are also compared with the performance obtained without dimension reduction.
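To make the PCA step concrete, the following minimal sketch finds the first principal component of a small, highly correlated feature set by power iteration on the sample covariance matrix and projects the samples onto it. It is an illustration under stated assumptions, not the paper's pipeline: the toy data stands in for wavelet coefficients, the function name is hypothetical, and the data is assumed to have non-zero variance.

```python
import math
from typing import List, Tuple


def first_principal_component(
    rows: List[List[float]], iterations: int = 200
) -> Tuple[List[float], List[float]]:
    """Return (unit direction of largest variance, 1-D scores of each sample)."""
    n, d = len(rows), len(rows[0])
    # Center every feature on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project the centered samples onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy data: the second feature is roughly twice the first (strongly correlated).
samples = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
direction, scores = first_principal_component(samples)
```

Here two correlated features collapse to a single score per sample; retaining several components would require repeating the procedure on the deflated covariance matrix or using a full eigendecomposition.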

Keywords: PCA, wavelet transformation, lazy classifiers, Kstar, IBL, LWL

Procedia PDF Downloads 316

5677 Using Reservoir Models for Monitoring Geothermal Surface Features

Authors: John P. O’Sullivan, Thomas M. P. Ratouis, Michael J. O’Sullivan

Abstract:

As the use of geothermal energy grows internationally, more effort is required to monitor and protect areas with rare and important geothermal surface features. A number of approaches are presented for developing and calibrating numerical geothermal reservoir models that are capable of accurately representing geothermal surface features. The approaches are discussed in the context of case studies of the Rotorua geothermal system and the Orakei-korako geothermal system, both of which contain important surface features. The results show that models are able to match the available field data accurately and hence can be used as valuable tools for predicting the future response of the systems to changes in use.

Keywords: geothermal reservoir models, surface features, monitoring, TOUGH2

Procedia PDF Downloads 383

5676 Music Genre Classification Based on Non-Negative Matrix Factorization Features

Authors: Soyon Kim, Edward Kim

Abstract:

In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers provides statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured with timbre features such as mel-frequency cepstral coefficients (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) statistical features such as the mean, variance, minimum, and maximum of the timbre features and (2) modulation spectrum features such as the spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. In addition to these conventional basic long-term feature vectors, NMF-based feature vectors are proposed for use in genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, the entire full-band spectrum was used. For NMF-BFV, however, only the low-band spectrum was used, since the high-frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification.
In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as a classifier. The GTZAN multi-genre music database was used for training and testing. It is composed of 10 genres with 100 songs for each genre. To increase the reliability of the experiments, 10-fold cross validation was used. For a given input song, an extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for the 10 genres. An NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features, i.e. the statistical and modulation spectrum features, the NMF features provided increased accuracy with only a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, while the basic features with NMF-LSM and with NMF-BFV provided 85.1% and 84.2% accuracy, respectively. The basic features required a dimensionality of 460, whereas NMF-LSM and NMF-BFV each required a dimensionality of only 10. Combining the basic features, NMF-LSM, and NMF-BFV with an SVM using a radial basis function (RBF) kernel produced a significantly higher classification accuracy of 88.3% with a feature dimensionality of 480.
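The test-stage step of extracting NMF weighting values against a fixed, pre-trained basis can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the Euclidean-cost multiplicative update rule (the abstract does not state which NMF cost was used), it uses one basis vector per class for brevity, and the function name and toy vectors are hypothetical.

```python
from typing import List


def nmf_weights(
    basis: List[List[float]], v: List[float],
    iterations: int = 500, eps: float = 1e-12,
) -> List[float]:
    """Non-negative weights h such that sum_a h[a] * basis[a] approximates v.

    basis: k fixed non-negative basis vectors of length d (assumed pre-trained).
    v:     non-negative input vector of length d.
    """
    k, d = len(basis), len(v)
    # Quantities that stay constant while the basis is fixed: W^T v and W^T W.
    wtv = [sum(basis[a][i] * v[i] for i in range(d)) for a in range(k)]
    wtw = [
        [sum(basis[a][i] * basis[b][i] for i in range(d)) for b in range(k)]
        for a in range(k)
    ]
    h = [1.0] * k  # positive start so the multiplicative update can move
    for _ in range(iterations):
        denom = [sum(wtw[a][b] * h[b] for b in range(k)) for a in range(k)]
        # Multiplicative update keeps every weight non-negative.
        h = [h[a] * wtv[a] / (denom[a] + eps) for a in range(k)]
    return h


# Toy example: two hypothetical class basis vectors and an input that mixes them.
toy_basis = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
toy_input = [2.0, 0.5, 2.5]  # equals 2.0 * basis[0] + 0.5 * basis[1]
weights = nmf_weights(toy_basis, toy_input)
```

In a setup like the one described, the resulting weight vector (10 values for 10 genres) would be appended to the 460 basic features before being passed to the classifier, which accounts for the combined dimensionality of 480 when both NMF variants are used.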

Keywords: mel-frequency cepstral coefficient (MFCC), music genre classification, non-negative matrix factorization (NMF), support vector machine (SVM)

Procedia PDF Downloads 271