Search results for: feature selection
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3617

Search results for: feature selection

3257 Solution of Logistics Center Selection Problem Using the Axiomatic Design Method

Authors: Fulya Zaralı, Harun Resit Yazgan

Abstract:

Logistics centers represent areas that all national and international logistics and activities related to logistics can be implemented by the various businesses. Logistics centers have a key importance in joining the transport stream and the transport system operations. Therefore, it is important where these centers are positioned to be effective and efficient and to show the expected performance of the centers. In this study, the location selection problem to position the logistics center is discussed. Alternative centers are evaluated according certain criteria. The most appropriate center is identified using the axiomatic design method.

Keywords: axiomatic design, logistic center, facility location, information systems

Procedia PDF Downloads 327
3256 Closest Possible Neighbor of a Different Class: Explaining a Model Using a Neighbor Migrating Generator

Authors: Hassan Eshkiki, Benjamin Mora

Abstract:

The Neighbor Migrating Generator is a simple and efficient approach to finding the closest potential neighbor(s) with a different label for a given instance and so without the need to calibrate any kernel settings at all. This allows determining and explaining the most important features that will influence an AI model. It can be used to either migrate a specific sample to the class decision boundary of the original model within a close neighborhood of that sample or identify global features that can help localising neighbor classes. The proposed technique works by minimizing a loss function that is divided into two components which are independently weighted according to three parameters α, β, and ω, α being self-adjusting. Results show that this approach is superior to past techniques when detecting the smallest changes in the feature space and may also point out issues in models like over-fitting.

Keywords: explainable AI, EX AI, feature importance, counterfactual explanations

Procedia PDF Downloads 127
3255 Video Stabilization Using Feature Point Matching

Authors: Shamsundar Kulkarni

Abstract:

Video capturing by non-professionals will lead to unanticipated effects. Such as image distortion, image blurring etc. Hence, many researchers study such drawbacks to enhance the quality of videos. In this paper, an algorithm is proposed to stabilize jittery videos .A stable output video will be attained without the effect of jitter which is caused due to shaking of handheld camera during video recording. Firstly, salient points from each frame from the input video are identified and processed followed by optimizing and stabilize the video. Optimization includes the quality of the video stabilization. This method has shown good result in terms of stabilization and it discarded distortion from the output videos recorded in different circumstances.

Keywords: video stabilization, point feature matching, salient points, image quality measurement

Procedia PDF Downloads 285
3254 Developing an Out-of-Distribution Generalization Model Selection Framework through Impurity and Randomness Measurements and a Bias Index

Authors: Todd Zhou, Mikhail Yurochkin

Abstract:

Out-of-distribution (OOD) detection is receiving increasing amounts of attention in the machine learning research community, boosted by recent technologies, such as autonomous driving and image processing. This newly-burgeoning field has called for the need for more effective and efficient methods for out-of-distribution generalization methods. Without accessing the label information, deploying machine learning models to out-of-distribution domains becomes extremely challenging since it is impossible to evaluate model performance on unseen domains. To tackle this out-of-distribution detection difficulty, we designed a model selection pipeline algorithm and developed a model selection framework with different impurity and randomness measurements to evaluate and choose the best-performing models for out-of-distribution data. By exploring different randomness scores based on predicted probabilities, we adopted the out-of-distribution entropy and developed a custom-designed score, ”CombinedScore,” as the evaluation criterion. This proposed score was created by adding labeled source information into the judging space of the uncertainty entropy score using harmonic mean. Furthermore, the prediction bias was explored through the equality of opportunity violation measurement. We also improved machine learning model performance through model calibration. The effectiveness of the framework with the proposed evaluation criteria was validated on the Folktables American Community Survey (ACS) datasets.

Keywords: model selection, domain generalization, model fairness, randomness measurements, bias index

Procedia PDF Downloads 102
3253 Weighted Rank Regression with Adaptive Penalty Function

Authors: Kang-Mo Jung

Abstract:

The use of regularization for statistical methods has become popular. The least absolute shrinkage and selection operator (LASSO) framework has become the standard tool for sparse regression. However, it is well known that the LASSO is sensitive to outliers or leverage points. We consider a new robust estimation which is composed of the weighted loss function of the pairwise difference of residuals and the adaptive penalty function regulating the tuning parameter for each variable. Rank regression is resistant to regression outliers, but not to leverage points. By adopting a weighted loss function, the proposed method is robust to leverage points of the predictor variable. Furthermore, the adaptive penalty function gives us good statistical properties in variable selection such as oracle property and consistency. We develop an efficient algorithm to compute the proposed estimator using basic functions in program R. We used an optimal tuning parameter based on the Bayesian information criterion (BIC). Numerical simulation shows that the proposed estimator is effective for analyzing real data set and contaminated data.

Keywords: adaptive penalty function, robust penalized regression, variable selection, weighted rank regression

Procedia PDF Downloads 436
3252 Firm Level Productivity Heterogeneity and Export Behavior: Evidence from UK

Authors: Umut Erksan Senalp

Abstract:

The aim of this study is to examine the link between firm level productivity heterogeneity and firm’s decision to export. Thus, we test the self selection hypothesis which suggests only more productive firms self select themselves to export markets. We analyze UK manufacturing sector by using firm-level data for the period 2003-2011. Although our preliminary results suggest that exporters outperform non-exporters when we pool all manufacturing industries, when we examine each industry individually, we find that self-selection hypothesis does not hold for each industries.

Keywords: total factor productivity, firm heterogeneity, international trade, decision to export

Procedia PDF Downloads 340
3251 Constructing a Semi-Supervised Model for Network Intrusion Detection

Authors: Tigabu Dagne Akal

Abstract:

While advances in computer and communications technology have made the network ubiquitous, they have also rendered networked systems vulnerable to malicious attacks devised from a distance. These attacks or intrusions start with attackers infiltrating a network through a vulnerable host and then launching further attacks on the local network or Intranet. Nowadays, system administrators and network professionals can attempt to prevent such attacks by developing intrusion detection tools and systems using data mining technology. In this study, the experiments were conducted following the Knowledge Discovery in Database Process Model. The Knowledge Discovery in Database Process Model starts from selection of the datasets. The dataset used in this study has been taken from Massachusetts Institute of Technology Lincoln Laboratory. After taking the data, it has been pre-processed. The major pre-processing activities include fill in missed values, remove outliers; resolve inconsistencies, integration of data that contains both labelled and unlabelled datasets, dimensionality reduction, size reduction and data transformation activity like discretization tasks were done for this study. A total of 21,533 intrusion records are used for training the models. For validating the performance of the selected model a separate 3,397 records are used as a testing set. For building a predictive model for intrusion detection J48 decision tree and the Naïve Bayes algorithms have been tested as a classification approach for both with and without feature selection approaches. The model that was created using 10-fold cross validation using the J48 decision tree algorithm with the default parameter values showed the best classification accuracy. The model has a prediction accuracy of 96.11% on the training datasets and 93.2% on the test dataset to classify the new instances as normal, DOS, U2R, R2L and probe classes. The findings of this study have shown that the data mining methods generates interesting rules that are crucial for intrusion detection and prevention in the networking industry. Future research directions are forwarded to come up an applicable system in the area of the study.

Keywords: intrusion detection, data mining, computer science, data mining

Procedia PDF Downloads 270
3250 Technology Enriched Classroom for Intercultural Competence Building through Films

Authors: Tamara Matevosyan

Abstract:

In this globalized world, intercultural communication is becoming essential for understanding communication among people, for developing understanding of cultures, to appreciate the opportunities and challenges that each culture presents to people. Moreover, it plays an important role in developing an ideal personification to understand different behaviors in different cultures. Native speakers assimilate sociolinguistic knowledge in natural conditions, while it is a great problem for language learners, and in this context feature films reveal cultural peculiarities and involve students in real communication. As we know nowadays the key role of language learning is the development of intercultural competence as communicating with someone from a different cultural background can be exciting and scary, frustrating and enlightening. Intercultural competence is important in FL learning classroom and here feature films can perform as essential tools to develop this competence and overcome the intercultural gap that foreign students face. Current proposal attempts to reveal the correlation of the given culture and language through feature films. To ensure qualified, well-organized and practical classes on Intercultural Communication for language learners a number of methods connected with movie watching have been implemented. All the pre-watching, while watching and post-watching methods and techniques are aimed at developing students’ communicative competence. The application of such activities as Climax, Role-play, Interactive Language, Daily Life helps to reveal and overcome mistakes of cultural and pragmatic character. All the above-mentioned activities are directed at the assimilation of the language vocabulary with special reference to the given culture. The study dwells into the essence of culture as one of the core concepts of intercultural communication. Sometimes culture is not a priority in the process of language learning which leads to further misunderstandings in real life communication. The application of various methods and techniques with feature films aims at developing students’ cultural competence, their understanding of norms and values of individual cultures. Thus, feature film activities will enable learners to enlarge their knowledge of the particular culture and develop a fundamental insight into intercultural communication.

Keywords: climax, intercultural competence, interactive language, role-play

Procedia PDF Downloads 317
3249 Site Selection of CNG Station by Using FUZZY-AHP Model (Case Study: Gas Zone 4, Tehran City Iran)

Authors: Hamidrza Joodaki

Abstract:

The most complex issue in urban land use planning is site selection that needs to assess the verity of elements and factors. Multi Criteria Decision Making (MCDM) methods are the best approach to deal with complex problems. In this paper, combination of the analytical hierarchy process (AHP) model and FUZZY logic was used as MCDM methods to select the best site for gas station in the 4th gas zone of Tehran. The first and the most important step in FUZZY-AHP model is selection of criteria and sub-criteria. Population, accessibility, proximity and natural disasters were considered as the main criteria in this study. After choosing the criteria, they were weighted based on AHP by EXPERT CHOICE software, and FUZZY logic was used to enhance accuracy and to approach the reality. After these steps, criteria layers were produced and weighted based on FUZZY-AHP model in GIS. Finally, through ARC GIS software, the layers were integrated and the 4th gas zone in TEHRAN was selected as the best site to locate gas station.

Keywords: multiple criteria decision making (MCDM), analytic hierarchy process (AHP), FUZZY logic, geographic information system (GIS)

Procedia PDF Downloads 331
3248 Indoor Real-Time Positioning and Mapping Based on Manhattan Hypothesis Optimization

Authors: Linhang Zhu, Hongyu Zhu, Jiahe Liu

Abstract:

This paper investigated a method of indoor real-time positioning and mapping based on the Manhattan world assumption. In indoor environments, relying solely on feature matching techniques or other geometric algorithms for sensor pose estimation inevitably resulted in cumulative errors, posing a significant challenge to indoor positioning. To address this issue, we adopt the Manhattan world hypothesis to optimize the camera pose algorithm based on feature matching, which improves the accuracy of camera pose estimation. A special processing method was applied to image data frames that conformed to the Manhattan world assumption. When similar data frames appeared subsequently, this could be used to eliminate drift in sensor pose estimation, thereby reducing cumulative errors in estimation and optimizing mapping and positioning. Through experimental verification, it is found that our method achieves high-precision real-time positioning in indoor environments and successfully generates maps of indoor environments. This provides effective technical support for applications such as indoor navigation and robot control.

Keywords: Manhattan world hypothesis, real-time positioning and mapping, feature matching, loopback detection

Procedia PDF Downloads 40
3247 High-Throughput Screening and Selection of Electrogenic Microbial Communities Using Single Chamber Microbial Fuel Cells Based on 96-Well Plate Array

Authors: Lukasz Szydlowski, Jiri Ehlich, Igor Goryanin

Abstract:

We demonstrate a single chamber, 96-well-plated based Microbial Fuel Cell (MFC) with printed, electronic components. This invention is aimed at robust selection of electrogenic microbial community under specific conditions, e.g., electrode potential, pH, nutrient concentration, salt concentration that can be altered within the 96 well plate array. This invention enables robust selection of electrogenic microbial community under the homogeneous reactor, with multiple conditions that can be altered to allow comparative analysis. It can be used as a standalone technique or in conjunction with other selective processes, e.g., flow cytometry, microfluidic-based dielectrophoretic trapping. Mobile conductive elements, like carbon paper, carbon sponge, activated charcoal granules, metal mesh, can be inserted inside to increase the anode surface area in order to collect electrogenic microorganisms and to transfer them into new reactors or for other analytical works. An array of 96-well plate allows this device to be operated by automated pipetting stations.

Keywords: bioengineering, electrochemistry, electromicrobiology, microbial fuel cell

Procedia PDF Downloads 120
3246 Grain Selection in Spiral Grain Selectors during Casting Single-Crystal Turbine Blades

Authors: M. Javahar, H. B. Dong

Abstract:

Single crystal components manufactured using Ni-base Superalloys are routinely used in the hot sections of aero engines and industrial gas turbines due to their outstanding high temperature strength, toughness and resistance to degradation in corrosive and oxidative environments. To control the quality of the single crystal turbine blades, particular attention has been paid to grain selection, which is used to obtain the single crystal morphology from a plethora of columnar grains. For this purpose, different designs of grain selectors are employed and the most common type is the spiral grain selector. A typical spiral grain selector includes a starter block and a spiral (helix) located above. It has been found that the grains with orientation well aligned to the thermal gradient survive in the starter block by competitive grain growth while the selection of the single crystal grain occurs in the spiral part. In the present study, 2D spiral selectors with different geometries were designed and produced using a state-of-the-art Bridgeman Directional Solidification casting furnace to investigate the competitive growth during grain selection in 2d grain selectors. The principal advantage of using a 2-D selector is to facilitate the wax injection process in investment casting by enabling significant degree of automation. The automation within the process can be derived by producing 2D grain selector wax patterns parts using a split die (metal mold model) coupled with wax injection stage. This will not only produce the part with high accuracy but also at an acceptable production rate.

Keywords: grain selector, single crystal, directional solidification, CMSX-4 superalloys, investment casting

Procedia PDF Downloads 559
3245 Visualization-Based Feature Extraction for Classification in Real-Time Interaction

Authors: Ágoston Nagy

Abstract:

This paper introduces a method of using unsupervised machine learning to visualize the feature space of a dataset in 2D, in order to find most characteristic segments in the set. After dimension reduction, users can select clusters by manual drawing. Selected clusters are recorded into a data model that is used for later predictions, based on realtime data. Predictions are made with supervised learning, using Gesture Recognition Toolkit. The paper introduces two example applications: a semantic audio organizer for analyzing incoming sounds, and a gesture database organizer where gestural data (recorded by a Leap motion) is visualized for further manipulation.

Keywords: gesture recognition, machine learning, real-time interaction, visualization

Procedia PDF Downloads 325
3244 Computer-Based Model for Design Selection of Lightning Arrester for 132/33kV Substation

Authors: Uma U. Uma, Uzoechi Laz

Abstract:

Protection of equipment insulation against lightning over voltages and selection of lightning arrester that will discharge at lower voltage level than the voltage required to breakdown the electrical equipment insulation is examined. The objectives of this paper are to design a computer based model using standard equations for the selection of appropriate lightning arrester with the lowest rated surge arrester that will provide adequate protection of equipment insulation and equally have a satisfactory service life when connected to a specified line voltage in power system network. The effectiveness and non-effectiveness of the earthing system of substation determine arrester properties. MATLAB program with GUI (graphic user interphase) its subprogram is used in the development of the model for the determination of required parameters like voltage rating, impulse spark over voltage, power frequency spark over voltage, discharge current, current rating and protection level of lightning arrester of a specified voltage level of a particular line.

Keywords: lightning arrester, GUIs, MatLab program, computer based model

Procedia PDF Downloads 391
3243 Prioritization of Customer Order Selection Factors by Utilizing Conjoint Analysis: A Case Study for a Structural Steel Firm

Authors: Burcu Akyildiz, Cigdem Kadaifci, Y. Ilker Topcu, Burc Ulengin

Abstract:

In today’s business environment, companies should make strategic decisions to gain sustainable competitive advantage. Order selection is a crucial issue among these decisions especially for steel production industry. When the companies allocate a high proportion of their design and production capacities to their ongoing projects, determining which customer order should be chosen among the potential orders without exceeding the remaining capacity is the major critical problem. In this study, it is aimed to identify and prioritize the evaluation factors for the customer order selection problem. Conjoint analysis is used to examine the importance level of each factor which is determined as the potential profit rate per unit of time, the compatibility of potential order with available capacity, the level of potential future order with higher profit, customer credit of future business opportunity, and the negotiability level of production schedule for the order.

Keywords: conjoint analysis, order prioritization, profit management, structural steel firm

Procedia PDF Downloads 365
3242 Managing the Local Manager: A Comparative Study of Core HRM Functions in Multinationals

Authors: Maria Khan

Abstract:

Framing good core Human Resource Management (HRM) functions like recruitment, selection, training and development, which if executed effectively, can become a strategic advantage for a company. HRM policies related to mid-level managers can depend on the type of top management. This may be due to the difference in perception of effective HRM policies of an expatriate and local leadership. This comparative case study assesses how local mid-level managers are managed in leading multinational telecom companies in Pakistan. Core HRM functions related to managers were analysed through field research based on semi-structured interviews with relevant Human Resource Managers. Results suggest that recruitment and selection practices are not too different and are in compliance with best HRM practices. However, there is a difference in the effective implementation of Training and Development policies. Changing global management trends and skill development dictate that MNCs continuously develop the local talent effectively for local and international success.

Keywords: recruitment, selection, training, development, core HRM, human resource management, subsidiary, international staffing, managers, MNC, expatriate

Procedia PDF Downloads 294
3241 Evaluation of QSRR Models by Sum of Ranking Differences Approach: A Case Study of Prediction of Chromatographic Behavior of Pesticides

Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević

Abstract:

The present study deals with the selection of the most suitable quantitative structure-retention relationship (QSRR) models which should be used in prediction of the retention behavior of basic, neutral, acidic and phenolic pesticides which belong to different classes: fungicides, herbicides, metabolites, insecticides and plant growth regulators. Sum of ranking differences (SRD) approach can give a different point of view on selection of the most consistent QSRR model. SRD approach can be applied not only for ranking of the QSRR models, but also for detection of similarity or dissimilarity among them. Applying the SRD analysis, the most similar models can be found easily. In this study, selection of the best model was carried out on the basis of the reference ranking (“golden standard”) which was defined as the row average values of logarithm of retention time (logtr) defined by high performance liquid chromatography (HPLC). Also, SRD analysis based on experimental logtr values as reference ranking revealed similar grouping of the established QSRR models already obtained by hierarchical cluster analysis (HCA).

Keywords: chemometrics, chromatography, pesticides, sum of ranking differences

Procedia PDF Downloads 352
3240 An Efficient Strategy for Relay Selection in Multi-Hop Communication

Authors: Jung-In Baik, Seung-Jun Yu, Young-Min Ko, Hyoung-Kyu Song

Abstract:

This paper proposes an efficient relaying algorithm to obtain diversity for improving the reliability of a signal. The algorithm achieves time or space diversity gain by multiple versions of the same signal through two routes. Relays are separated between a source and destination. The routes between the source and destination are set adaptive in order to deal with different channels and noises. The routes consist of one or more relays and the source transmits its signal to the destination through the routes. The signals from the relays are combined and detected at the destination. The proposed algorithm provides a better performance than the conventional algorithms in bit error rate (BER).

Keywords: multi-hop, OFDM, relay, relaying selection

Procedia PDF Downloads 421
3239 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes and stroke characteristics variations. Malayalam is a complex south indian language spoken by about 35 million people especially in Kerala and Lakshadweep islands. In this paper, we consider the significant feature extraction for the similar stroke styles of Malayalam. This extracted feature set are suitable for the recognition of other handwritten south indian languages like Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy in classification and recognition of online malayalam handwritten characters. SVM Classifiers are the best for real world applications. The contribution of various features towards the accuracy in recognition is analysed. Performance for different kernels of SVM are also studied. A graphical user interface has developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after the preprocessing of input data samples. Highest recognition accuracy of 97% is obtained experimentally at the best feature combination with polynomial kernel in SVM.

Keywords: SVM, matlab, malayalam, South Indian scripts, onlinehandwritten character recognition

Procedia PDF Downloads 551
3238 Real-Time Classification of Marbles with Decision-Tree Method

Authors: K. S. Parlak, E. Turan

Abstract:

The separation of marbles according to the pattern quality is a process made according to expert decision. The classification phase is the most critical part in terms of economic value. In this study, a self-learning system is proposed which performs the classification of marbles quickly and with high success. This system performs ten feature extraction by taking ten marble images from the camera. The marbles are classified by decision tree method using the obtained properties. The user forms the training set by training the system at the marble classification stage. The system evolves itself in every marble image that is classified. The aim of the proposed system is to minimize the error caused by the person performing the classification and achieve it quickly.

Keywords: decision tree, feature extraction, k-means clustering, marble classification

Procedia PDF Downloads 358
3237 Hybrid Anomaly Detection Using Decision Tree and Support Vector Machine

Authors: Elham Serkani, Hossein Gharaee Garakani, Naser Mohammadzadeh, Elaheh Vaezpour

Abstract:

Intrusion detection systems (IDS) are the main components of network security. These systems analyze the network events for intrusion detection. The design of an IDS is through the training of normal traffic data or attack. The methods of machine learning are the best ways to design IDSs. In the method presented in this article, the pruning algorithm of C5.0 decision tree is being used to reduce the features of traffic data used and training IDS by the least square vector algorithm (LS-SVM). Then, the remaining features are arranged according to the predictor importance criterion. The least important features are eliminated in the order. The remaining features of this stage, which have created the highest level of accuracy in LS-SVM, are selected as the final features. The features obtained, compared to other similar articles which have examined the selected features in the least squared support vector machine model, are better in the accuracy, true positive rate, and false positive. The results are tested by the UNSW-NB15 dataset.

Keywords: decision tree, feature selection, intrusion detection system, support vector machine

Procedia PDF Downloads 238
3236 Evidence of Natural Selection Footprints among Some African Chicken Breeds and Village Ecotypes

Authors: Ahmed Elbeltagy, Francesca Bertolini, Damarius Fleming, Angelica Van Goor, Chris Ashwell, Carl Schmidt, Donald Kugonza, Susan Lamont, Max Rothschild

Abstract:

The major factor in shaping genomic variation of the African indigenous rural chicken is likely natural selection drives the development genetic footprints in the chicken genomes. To investigate such a hypothesis of a selection footprint, a total of 292 birds were randomly sampled from three indigenous ecotypes from East Africa (Uganda, Rwanda) and North Africa (Egypt) and two registered Egyptian breeds (Fayoumi and Dandarawi), and from the synthetic Kuroiler breed. Samples were genotyped using the Affymetrix 600K Axiom® Array. A total of 526,652 SNPs were utilized in the downstream analysis after quality control measures. The intra-population runs of homozygosity (ROH) that were consensuses in > 50% of individuals of an ecotype or > 75% of a breed were studied. To identify inter-population differentiation due to genetic structure, FST was calculated for North- vs. East- African populations in addition to population-pairwise combinations for overlapping windows (500Kb with an overlap of 250Kb). A total of 28,563 ROH were determined and were classified into three length categories. ROH and Fst detected sweeps were identified on several autosomes. Several genes in these regions are likely to be related to adaptation to local environmental stresses that include high altitude, diseases resistance, poor nutrition, oxidative and heat stresses and were linked to gene ontology terms (GO) related to immune response, oxygen consumption and heme binding, carbohydrate metabolism, oxidation-reduction, and behavior. Results indicated a possible effect of natural selection forces on shaping genomic structure for adaptation to local environmental stresses.

Keywords: African Chicken, runs of homozygosity, FST, selection footprints

Procedia PDF Downloads 293
3235 Using the Smith-Waterman Algorithm to Extract Features in the Classification of Obesity Status

Authors: Rosa Figueroa, Christopher Flores

Abstract:

Text categorization is the problem of assigning a new document to a set of predetermined categories, on the basis of a training set of free-text data that contains documents whose category membership is known. To train a classification model, it is necessary to extract characteristics in the form of tokens that facilitate the learning and classification process. In text categorization, the feature extraction process involves the use of word sequences also known as N-grams. In general, it is expected that documents belonging to the same category share similar features. The Smith-Waterman (SW) algorithm is a dynamic programming algorithm that performs a local sequence alignment in order to determine similar regions between two strings or protein sequences. This work explores the use of SW algorithm as an alternative to feature extraction in text categorization. The dataset used for this purpose, contains 2,610 annotated documents with the classes Obese/Non-Obese. This dataset was represented in a matrix form using the Bag of Word approach. The score selected to represent the occurrence of the tokens in each document was the term frequency-inverse document frequency (TF-IDF). In order to extract features for classification, four experiments were conducted: the first experiment used SW to extract features, the second one used unigrams (single word), the third one used bigrams (two word sequence) and the last experiment used a combination of unigrams and bigrams to extract features for classification. To test the effectiveness of the extracted feature set for the four experiments, a Support Vector Machine (SVM) classifier was tuned using 20% of the dataset. The remaining 80% of the dataset together with 5-Fold Cross Validation were used to evaluate and compare the performance of the four experiments of feature extraction. Results from the tuning process suggest that SW performs better than the N-gram based feature extraction. These results were confirmed by using the remaining 80% of the dataset, where SW performed the best (accuracy = 97.10%, weighted average F-measure = 97.07%). The second best was obtained by the combination of unigrams-bigrams (accuracy = 96.04, weighted average F-measure = 95.97) closely followed by the bigrams (accuracy = 94.56%, weighted average F-measure = 94.46%) and finally unigrams (accuracy = 92.96%, weighted average F-measure = 92.90%).

Keywords: comorbidities, machine learning, obesity, Smith-Waterman algorithm

Procedia PDF Downloads 272
3234 Machine Learning Assisted Prediction of Sintered Density of Binary W(MO) Alloys

Authors: Hexiong Liu

Abstract:

Powder metallurgy is the optimal method for the consolidation and preparation of W(Mo) alloys, which exhibit excellent application prospects at high temperatures. The properties of W(Mo) alloys are closely related to the sintered density. However, controlling the sintered density and porosity of these alloys is still challenging. In the past, the regulation methods mainly focused on time-consuming and costly trial-and-error experiments. In this study, the sintering data for more than a dozen W(Mo) alloys constituted a small-scale dataset, including both solid and liquid phases of sintering. Furthermore, simple descriptors were used to predict the sintered density of W(Mo) alloys based on the descriptor selection strategy and machine learning method (ML), where the ML algorithm included the least absolute shrinkage and selection operator (Lasso) regression, k-nearest neighbor (k-NN), random forest (RF), and multi-layer perceptron (MLP). The results showed that the interpretable descriptors extracted by our proposed selection strategy and the MLP neural network achieved a high prediction accuracy (R>0.950). By further predicting the sintered density of W(Mo) alloys using different sintering processes, the error between the predicted and experimental values was less than 0.063, confirming the application potential of the model.

Keywords: sintered density, machine learning, interpretable descriptors, W(Mo) alloy

Procedia PDF Downloads 54
3233 A Comparative Study on Automatic Feature Classification Methods of Remote Sensing Images

Authors: Lee Jeong Min, Lee Mi Hee, Eo Yang Dam

Abstract:

Geospatial feature extraction is a very important issue in the remote sensing research. In the meantime, the image classification based on statistical techniques, but, in recent years, data mining and machine learning techniques for automated image processing technology is being applied to remote sensing it has focused on improved results generated possibility. In this study, artificial neural network and decision tree technique is applied to classify the high-resolution satellite images, as compared to the MLC processing result is a statistical technique and an analysis of the pros and cons between each of the techniques.

Keywords: remote sensing, artificial neural network, decision tree, maximum likelihood classification

Procedia PDF Downloads 327
3232 Local Directional Encoded Derivative Binary Pattern Based Coral Image Classification Using Weighted Distance Gray Wolf Optimization Algorithm

Authors: Annalakshmi G., Sakthivel Murugan S.

Abstract:

This paper presents a local directional encoded derivative binary pattern (LDEDBP) feature extraction method that can be applied for the classification of submarine coral reef images. The classification of coral reef images using texture features is difficult due to the dissimilarities in class samples. In coral reef image classification, texture features are extracted using the proposed method called local directional encoded derivative binary pattern (LDEDBP). The proposed approach extracts the complete structural arrangement of the local region using local binary batten (LBP) and also extracts the edge information using local directional pattern (LDP) from the edge response available in a particular region, thereby achieving extra discriminative feature value. Typically the LDP extracts the edge details in all eight directions. The process of integrating edge responses along with the local binary pattern achieves a more robust texture descriptor than the other descriptors used in texture feature extraction methods. Finally, the proposed technique is applied to an extreme learning machine (ELM) method with a meta-heuristic algorithm known as weighted distance grey wolf optimizer (GWO) to optimize the input weight and biases of single-hidden-layer feed-forward neural networks (SLFN). In the empirical results, ELM-WDGWO demonstrated their better performance in terms of accuracy on all coral datasets, namely RSMAS, EILAT, EILAT2, and MLC, compared with other state-of-the-art algorithms. The proposed method achieves the highest overall classification accuracy of 94% compared to the other state of art methods.

Keywords: feature extraction, local directional pattern, ELM classifier, GWO optimization

Procedia PDF Downloads 140
3231 Vision Based People Tracking System

Authors: Boukerch Haroun, Luo Qing Sheng, Li Hua Shi, Boukraa Sebti

Abstract:

In this paper we present the design and the implementation of a target tracking system where the target is set to be a moving person in a video sequence. The system can be applied easily as a vision system for mobile robot. The system is composed of two major parts the first is the detection of the person in the video frame using the SVM learning machine based on the “HOG” descriptors. The second part is the tracking of a moving person it’s done by using a combination of the Kalman filter and a modified version of the Camshift tracking algorithm by adding the target motion feature to the color feature, the experimental results had shown that the new algorithm had overcame the traditional Camshift algorithm in robustness and in case of occlusion.

Keywords: camshift algorithm, computer vision, Kalman filter, object tracking

Procedia PDF Downloads 422
3230 Metaheuristic to Align Multiple Sequences

Authors: Lamiche Chaabane

Abstract:

In this study, a new method for solving sequence alignment problem is proposed, which is named ITS (Improved Tabu Search). This algorithm is based on the classical Tabu Search (TS). ITS is implemented in order to obtain results of multiple sequence alignment. Several ideas concerning neighbourhood generation, move selection mechanisms and intensification/diversification strategies for our proposed ITS is investigated. ITS have generated high-quality results in terms of measure of scores in comparison with the classical TS and simple iterative search algorithm.

Keywords: multiple sequence alignment, tabu search, improved tabu search, neighbourhood generation, selection mechanisms

Procedia PDF Downloads 274
3229 Criterion-Referenced Test Reliability through Threshold Loss Agreement: Fuzzy Logic Analysis Approach

Authors: Mohammad Ali Alavidoost, Hossein Bozorgian

Abstract:

Criterion-referenced tests (CRTs) are designed to measure student performance against a fixed set of predetermined criteria or learning standards. The reliability of such tests cannot be based on internal reliability. Threshold loss agreement is one way to calculate the reliability of CRTs. However, the selection of master and non-master in such agreement is determined by the threshold point. The problem is if the threshold point witnesses a minute change, the selection of master and non-master may have a drastic change, leading to the change in reliability results. Therefore, in this study, the Fuzzy logic approach is employed as a remedial procedure for data analysis to obviate the threshold point problem. Forty-one Iranian students were selected; the participants were all between 20 and 30 years old. A quantitative approach was used to address the research questions. In doing so, a quasi-experimental design was utilized since the selection of the participants was not randomized. Based on the Fuzzy logic approach, the threshold point would be more stable during the analysis, resulting in rather constant reliability results and more precise assessment.

Keywords: criterion-referenced tests, threshold loss agreement, threshold point, fuzzy logic approach

Procedia PDF Downloads 342
3228 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Ben Soltane Cheima, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still need of improvement. In this paper, a Closed-Set Tex-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficient (MFCC) as feature extraction and suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with Expectation Maximization algorithm (EM) for speaker modeling. The use of Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and Statistical Modeling of Background Noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. Also investigation of Linde-Buzo-Gray (LBG) clustering algorithm for initialization of GMM, for estimating the underlying parameters, in the EM step improved the convergence rate and systems performance. It also uses relative index as confidence measures in case of contradiction in identification process by GMM and VQ as well. Simulation results carried out on voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.

Keywords: feature extraction, speaker modeling, feature matching, Mel frequency cepstrum coefficient (MFCC), Gaussian mixture model (GMM), vector quantization (VQ), Linde-Buzo-Gray (LBG), expectation maximization (EM), pre-processing, voice activity detection (VAD), short time energy (STE), background noise statistical modeling, closed-set tex-independent speaker identification system (CISI)

Procedia PDF Downloads 284