Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3618

Search results for: variable selection

3618 A Two-Stage Bayesian Variable Selection Method with the Extension of Lasso for Geo-Referenced Data

Authors: Georgiana Onicescu, Yuqian Shen

Abstract:

Due to the complex nature of geo-referenced data, multicollinearity of the risk factors in public health spatial studies is a commonly encountered issue, which leads to low parameter estimation accuracy because it inflates the variance in the regression analysis. To address this issue, we proposed a two-stage variable selection method by extending the least absolute shrinkage and selection operator (Lasso) to the Bayesian spatial setting, investigating the impact of risk factors to health outcomes. Specifically, in stage I, we performed the variable selection using Bayesian Lasso and several other variable selection approaches. Then, in stage II, we performed the model selection with only the selected variables from stage I and compared again the methods. To evaluate the performance of the two-stage variable selection methods, we conducted a simulation study with different distributions for the risk factors, using geo-referenced count data as the outcome and Michigan as the research region. We considered the cases when all candidate risk factors are independently normally distributed, or follow a multivariate normal distribution with different correlation levels. Two other Bayesian variable selection methods, Binary indicator, and the combination of Binary indicator and Lasso were considered and compared as alternative methods. The simulation results indicated that the proposed two-stage Bayesian Lasso variable selection method has the best performance for both independent and dependent cases considered. When compared with the one-stage approach, and the other two alternative methods, the two-stage Bayesian Lasso approach provides the highest estimation accuracy in all scenarios considered.

Keywords: Lasso, Bayesian analysis, spatial analysis, variable selection

Procedia PDF Downloads 57
3617 Bayesian Variable Selection in Quantile Regression with Application to the Health and Retirement Study

Authors: Priya Kedia, Kiranmoy Das

Abstract:

There is a rich literature on variable selection in regression setting. However, most of these methods assume normality for the response variable under consideration for implementing the methodology and establishing the statistical properties of the estimates. In many real applications, the distribution for the response variable may be non-Gaussian, and one might be interested in finding the best subset of covariates at some predetermined quantile level. We develop dynamic Bayesian approach for variable selection in quantile regression framework. We use a zero-inflated mixture prior for the regression coefficients, and consider the asymmetric Laplace distribution for the response variable for modeling different quantiles of its distribution. An efficient Gibbs sampler is developed for our computation. Our proposed approach is assessed through extensive simulation studies, and real application of the proposed approach is also illustrated. We consider the data from health and retirement study conducted by the University of Michigan, and select the important predictors when the outcome of interest is out-of-pocket medical cost, which is considered as an important measure for financial risk. Our analysis finds important predictors at different quantiles of the outcome, and thus enhance our understanding on the effects of different predictors on the out-of-pocket medical cost.

Keywords: variable selection, quantile regression, Gibbs sampler, asymmetric Laplace distribution

Procedia PDF Downloads 67
3616 Efficient Relay Selection Scheme Utilizing OVSF Code in Cooperative Communication System

Authors: Yeong-Seop Ahn, Myoung-Jin Kim, Young-Min Ko, Hyoung-Kyu Song

Abstract:

This paper proposes a relay selection scheme utilizing an orthogonal variable spreading factor (OVSF) code in a cooperative communication. The relay selection scheme influences on the communication performance in the cooperative communication. Conventional relay selection schemes such as the best harmonic mean relay selection scheme or the threshold-based relay selection scheme should know information such as channel state information (CSI) in advance. The proposed relay selection scheme does not require information in advance by using a reference signal utilizing the OVSF code. The simulation result shows that bit error rate (BER) performance of proposed relay selection scheme is similar to the best harmonic mean relay selection scheme that is known as one of the optimal relay selection schemes.

Keywords: cooperative communication, relay selection, OFDM, OVSF code

Procedia PDF Downloads 515
3615 Robust Variable Selection Based on Schwarz Information Criterion for Linear Regression Models

Authors: Shokrya Saleh A. Alshqaq, Abdullah Ali H. Ahmadini

Abstract:

The Schwarz information criterion (SIC) is a popular tool for selecting the best variables in regression datasets. However, SIC is defined using an unbounded estimator, namely, the least-squares (LS), which is highly sensitive to outlying observations, especially bad leverage points. A method for robust variable selection based on SIC for linear regression models is thus needed. This study investigates the robustness properties of SIC by deriving its influence function and proposes a robust SIC based on the MM-estimation scale. The aim of this study is to produce a criterion that can effectively select accurate models in the presence of vertical outliers and high leverage points. The advantages of the proposed robust SIC is demonstrated through a simulation study and an analysis of a real dataset.

Keywords: influence function, robust variable selection, robust regression, Schwarz information criterion

Procedia PDF Downloads 63
3614 Weighted Rank Regression with Adaptive Penalty Function

Authors: Kang-Mo Jung

Abstract:

The use of regularization for statistical methods has become popular. The least absolute shrinkage and selection operator (LASSO) framework has become the standard tool for sparse regression. However, it is well known that the LASSO is sensitive to outliers or leverage points. We consider a new robust estimation which is composed of the weighted loss function of the pairwise difference of residuals and the adaptive penalty function regulating the tuning parameter for each variable. Rank regression is resistant to regression outliers, but not to leverage points. By adopting a weighted loss function, the proposed method is robust to leverage points of the predictor variable. Furthermore, the adaptive penalty function gives us good statistical properties in variable selection such as oracle property and consistency. We develop an efficient algorithm to compute the proposed estimator using basic functions in program R. We used an optimal tuning parameter based on the Bayesian information criterion (BIC). Numerical simulation shows that the proposed estimator is effective for analyzing real data set and contaminated data.

Keywords: adaptive penalty function, robust penalized regression, variable selection, weighted rank regression

Procedia PDF Downloads 352
3613 Design of Target Selection for Pedestrian Autonomous Emergency Braking System

Authors: Tao Song, Hao Cheng, Guangfeng Tian, Chuang Xu

Abstract:

An autonomous emergency braking system is an advanced driving assistance system that enables vehicle collision avoidance and pedestrian collision avoidance to improve vehicle safety. At present, because the pedestrian target is small, and the mobility is large, the pedestrian AEB system is faced with more technical difficulties and higher functional requirements. In this paper, a method of pedestrian target selection based on a variable width funnel is proposed. Based on the current position and predicted position of pedestrians, the relative position of vehicle and pedestrian at the time of collision is calculated, and different braking strategies are adopted according to the hazard level of pedestrian collisions. In the CNCAP standard operating conditions, comparing the method of considering only the current position of pedestrians and the method of considering pedestrian prediction position, as well as the method based on fixed width funnel and variable width funnel, the results show that, based on variable width funnel, the choice of pedestrian target will be more accurate and the opportunity of the intervention of AEB system will be more reasonable by considering the predicted position of the pedestrian target and vehicle's lateral motion.

Keywords: automatic emergency braking system, pedestrian target selection, TTC, variable width funnel

Procedia PDF Downloads 71
3612 Supplier Selection by Considering Cost and Reliability

Authors: K. -H. Yang

Abstract:

Supplier selection problem is one of the important issues of supply chain problems. Two categories of methodologies include qualitative and quantitative approaches which can be applied to supplier selection problems. However, due to the complexities of the problem and lacking of reliable and quantitative data, qualitative approaches are more than quantitative approaches. This study considers operational cost and supplier’s reliability factor and solves the problem by using a quantitative approach. A mixed integer programming model is the primary analytic tool. Analyses of different scenarios with variable cost and reliability structures show that the effectiveness of this approach to the supplier selection problem.

Keywords: mixed integer programming, quantitative approach, supplier’s reliability, supplier selection

Procedia PDF Downloads 298
3611 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve safety and reliability of manufacturing operations and reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods are evaluated using real data. The results shows that the calibration model based on supervised probabilistic model yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 288
3610 Explanatory Variables for Crash Injury Risk Analysis

Authors: Guilhermina Torrao

Abstract:

An extensive number of studies have been conducted to determine the factors which influence crash injury risk (CIR); however, uncertainties inherent to selected variables have been neglected. A review of existing literature is required to not only obtain an overview of the variables and measures but also ascertain the implications when comparing studies without a systematic view of variable taxonomy. Therefore, the aim of this literature review is to examine and report on peer-reviewed studies in the field of crash analysis and to understand the implications of broad variations in variable selection in CIR analysis. The objective of this study is to demonstrate the variance in variable selection and classification when modeling injury risk involving occupants of light vehicles by presenting an analytical review of the literature. Based on data collected from 64 journal publications reported over the past 21 years, the analytical review discusses the variables selected by each study across an organized list of predictors for CIR analysis and provides a better understanding of the contribution of accident and vehicle factors to injuries acquired by occupants of light vehicles. A cross-comparison analysis demonstrates that almost half the studies (48%) did not consider vehicle design specifications (e.g., vehicle weight), whereas, for those that did, the vehicle age/model year was the most selected explanatory variable used by 41% of the literature studies. For those studies that included speed risk factor in their analyses, the majority (64%) used the legal speed limit data as a ‘proxy’ of vehicle speed at the moment of a crash, imposing limitations for CIR analysis and modeling. Despite the proven efficiency of airbags in minimizing injury impact following a crash, only 22% of studies included airbag deployment data. A major contribution of this study is to highlight the uncertainty linked to explanatory variable selection and identify opportunities for improvements when performing future studies in the field of road injuries.

Keywords: crash, exploratory, injury, risk, variables, vehicle

Procedia PDF Downloads 35
3609 Ground Motion Modeling Using the Least Absolute Shrinkage and Selection Operator

Authors: Yildiz Stella Dak, Jale Tezcan

Abstract:

Ground motion models that relate a strong motion parameter of interest to a set of predictive seismological variables describing the earthquake source, the propagation path of the seismic wave, and the local site conditions constitute a critical component of seismic hazard analyses. When a sufficient number of strong motion records are available, ground motion relations are developed using statistical analysis of the recorded ground motion data. In regions lacking a sufficient number of recordings, a synthetic database is developed using stochastic, theoretical or hybrid approaches. Regardless of the manner the database was developed, ground motion relations are developed using regression analysis. Development of a ground motion relation is a challenging process which inevitably requires the modeler to make subjective decisions regarding the inclusion criteria of the recordings, the functional form of the model and the set of seismological variables to be included in the model. Because these decisions are critically important to the validity and the applicability of the model, there is a continuous interest on procedures that will facilitate the development of ground motion models. This paper proposes the use of the Least Absolute Shrinkage and Selection Operator (LASSO) in selecting the set predictive seismological variables to be used in developing a ground motion relation. The LASSO can be described as a penalized regression technique with a built-in capability of variable selection. Similar to the ridge regression, the LASSO is based on the idea of shrinking the regression coefficients to reduce the variance of the model. Unlike ridge regression, where the coefficients are shrunk but never set equal to zero, the LASSO sets some of the coefficients exactly to zero, effectively performing variable selection. Given a set of candidate input variables and the output variable of interest, LASSO allows ranking the input variables in terms of their relative importance, thereby facilitating the selection of the set of variables to be included in the model. Because the risk of overfitting increases as the ratio of the number of predictors to the number of recordings increases, selection of a compact set of variables is important in cases where a small number of recordings are available. In addition, identification of a small set of variables can improve the interpretability of the resulting model, especially when there is a large number of candidate predictors. A practical application of the proposed approach is presented, using more than 600 recordings from the National Geospatial-Intelligence Agency (NGA) database, where the effect of a set of seismological predictors on the 5% damped maximum direction spectral acceleration is investigated. The set of candidate predictors considered are Magnitude, Rrup, Vs30. Using LASSO, the relative importance of the candidate predictors has been ranked. Regression models with increasing levels of complexity were constructed using one, two, three, and four best predictors, and the models’ ability to explain the observed variance in the target variable have been compared. The bias-variance trade-off in the context of model selection is discussed.

Keywords: ground motion modeling, least absolute shrinkage and selection operator, penalized regression, variable selection

Procedia PDF Downloads 254
3608 A Comparative Study of Additive and Nonparametric Regression Estimators and Variable Selection Procedures

Authors: Adriano Z. Zambom, Preethi Ravikumar

Abstract:

One of the biggest challenges in nonparametric regression is the curse of dimensionality. Additive models are known to overcome this problem by estimating only the individual additive effects of each covariate. However, if the model is misspecified, the accuracy of the estimator compared to the fully nonparametric one is unknown. In this work the efficiency of completely nonparametric regression estimators such as the Loess is compared to the estimators that assume additivity in several situations, including additive and non-additive regression scenarios. The comparison is done by computing the oracle mean square error of the estimators with regards to the true nonparametric regression function. Then, a backward elimination selection procedure based on the Akaike Information Criteria is proposed, which is computed from either the additive or the nonparametric model. Simulations show that if the additive model is misspecified, the percentage of time it fails to select important variables can be higher than that of the fully nonparametric approach. A dimension reduction step is included when nonparametric estimator cannot be computed due to the curse of dimensionality. Finally, the Boston housing dataset is analyzed using the proposed backward elimination procedure and the selected variables are identified.

Keywords: additive model, nonparametric regression, variable selection, Akaike Information Criteria

Procedia PDF Downloads 186
3607 Detecting Potential Biomarkers for Ulcerative Colitis Using Hybrid Feature Selection

Authors: Mustafa Alshawaqfeh, Bilal Wajidy, Echin Serpedin, Jan Suchodolski

Abstract:

Inflammatory Bowel disease (IBD) is a disease of the colon with characteristic inflammation. Clinically IBD is detected using laboratory tests (blood and stool), radiology tests (imaging using CT, MRI), capsule endoscopy and endoscopy. There are two variants of IBD referred to as Ulcerative Colitis (UC) and Crohn’s disease. This study employs a hybrid feature selection method that combines a correlation-based variable ranking approach with exhaustive search wrapper methods in order to find potential biomarkers for UC. The proposed biomarkers presented accurate discriminatory power thereby identifying themselves to be possible ingredients to UC therapeutics.

Keywords: ulcerative colitis, biomarker detection, feature selection, inflammatory bowel disease (IBD)

Procedia PDF Downloads 315
3606 Merit Measures and Validation in Employee Evaluation and Selection

Authors: Wilson P. R. Malebye, Solly M. Seeletse

Abstract:

Applicants for space in selection problems are usually compared subjectively, and the selection made are not reliable and often cannot be verified scientifically. The paper illustrates objective selection by involving a mathematical measure in selecting a candidate applying for a job, and then using other two independent measures, validates the choice made. The scientific process followed is SToR (SAW, TOPSIS, WP) in which Simple Additive Weighting (SAW) is used to select, and the TOPSIS (technique for order preference by similarity to ideal solution) and weighted product (WP) are used to validate. A practical exercise was obtained from a factual selection problem in a recruitment task undertaken in an organization in which the authors consulted, and their Human Resources (HR) department wanted to check if their selection was justifiable. The result was that our approach was consistent and convincing to that HR, and theirs was not because our selection was satisfactory while theirs could not be corroborated using any method.

Keywords: candidate selection, SToR, SW, TOPSIS, WP

Procedia PDF Downloads 249
3605 Optimal Selection of Replenishment Policies Using Distance Based Approach

Authors: Amit Gupta, Deepak Juneja, Sorabh Gupta

Abstract:

This paper presents a model based on distance based approach (DBA) method employed for evaluation, selection, and ranking of replenishment policies for a single location inventory, which hitherto not developed in the literature. This work recognizes the significance of the selection problem, identifies the selection criteria, the relative importance of selection criteria for this research problem. The developed model is capable of comparing any number of alternate inventory policies for various selection criteria where cardinal values are assigned as a rating to alternate inventory polices for selection criteria and weights of selection criteria. The illustrated example demonstrates the model and presents the result in terms of ranking of replenishment policies.

Keywords: DBA, ranking, replenishment policies, selection criteria

Procedia PDF Downloads 69
3604 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) making the classifier perform better than the one without feature selection. Since there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies consider examining the effect of performing different feature selection algorithms on the classification performances by different classifiers over different types of datasets. In this paper, two widely used algorithms, which are the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. On the other hand, three well-known classifiers are constructed, which are the CART decision tree (DT), multi-layer perceptron (MLP) neural network, and support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 566
3603 Determining of Importance Level of Factors Affecting Job Selection with the Method of AHP

Authors: Nurullah Ekmekci, Ömer Akkaya, Kazım Karaboğa, Mahmut Tekin

Abstract:

Job selection is one of the most important decisions that affect their lives in the name of being more useful to themselves and the society. There are many criteria to consider in the job selection. The amount of criteria in the job selection makes it a multi-criteria decision-making (MCDM) problem. In this study; job selection has been discussed as multi-criteria decision-making problem and has been solved by Analytic Hierarchy Process (AHP), one of the multi-criteria decision making methods. A survey, contains 5 different job selection criteria (finding a job friendliness, salary status, job , social security, work in the community deems reputation and business of the degree of difficulty) within many job selection criteria and 4 different job alternative (being academician, working at the civil service, working at the private sector and working at in their own business), has been conducted to the students of Selcuk University Faculty of Economics and Administrative Sciences. As a result of pairwise comparisons, the highest weighted criteria in the job selection and the most coveted job preferences were identified.

Keywords: analytical hierarchy process, job selection, multi-criteria, decision making

Procedia PDF Downloads 312
3602 On Optimum Stratification

Authors: M. G. M. Khan, V. D. Prasad, D. K. Rao

Abstract:

In this manuscript, we discuss the problem of determining the optimum stratification of a study (or main) variable based on the auxiliary variable that follows a uniform distribution. If the stratification of survey variable is made using the auxiliary variable it may lead to substantial gains in precision of the estimates. This problem is formulated as a Nonlinear Programming Problem (NLPP), which turn out to multistage decision problem and is solved using dynamic programming technique.

Keywords: auxiliary variable, dynamic programming technique, nonlinear programming problem, optimum stratification, uniform distribution

Procedia PDF Downloads 192
3601 Selection Standards for National Teams: Theory and Practice

Authors: Alexey Kulik

Abstract:

This article deals with selection standards for national sport teams. The author examines the legal framework for selection criteria and suggests using the most honest criteria.

Keywords: national teams, standards of forming teams, selection standards, sport legislations

Procedia PDF Downloads 422
3600 The Role of Recruitment and Selection in Financial Performance of Enterprises in Kosovo

Authors: Arta Jashari, Enver Kutllovci

Abstract:

Abstract— The purpose of this study is to examine the relationship of recruitment and selection practice and performance in medium service enterprises in Kosovo. A total of 110 managers from public and private sector was analyzed. Our empirical results show that enterprises in Kosovo use recruitment and selection practice and they know how important is to have the right people with skills and knowledge accordingly with the job requirements. The outcome of Pearson correlation analysis provides evidence that recruitment and selection practice, positively and significantly influence the financial performance. Also, our results show a significant relationship between the education of managers and the use of the recruitment and selection practice. From our results we can conclude and suggest that with a good recruiting and selection, the organization will fill with a group of potentially qualified candidates who will be able to fulfill the enterprises objective.

Keywords: Human Resource, Kosovo, Recruitment and Selection, Performance

Procedia PDF Downloads 64
3599 Hybrid Feature Selection Method for Sentiment Classification of Movie Reviews

Authors: Vishnu Goyal, Basant Agarwal

Abstract:

Sentiment analysis research provides methods for identifying the people’s opinion written in blogs, reviews, social networking websites etc. Sentiment analysis is to understand what opinion people have about any given entity, object or thing. Sentiment analysis research can be broadly categorised into three types of approaches i.e. semantic orientation, machine learning and lexicon based approaches. Feature selection methods improve the performance of the machine learning algorithms by eliminating the irrelevant features. Information gain feature selection method has been considered best method for sentiment analysis; however, it has the drawback of selection of threshold. Therefore, in this paper, we propose a hybrid feature selection methods comprising of information gain and proposed feature selection method. Initially, features are selected using Information Gain (IG) and further more noisy features are eliminated using the proposed feature selection method. Experimental results show the efficiency of the proposed feature selection methods.

Keywords: feature selection, sentiment analysis, hybrid feature selection

Procedia PDF Downloads 224
3598 Competence-Based Human Resources Selection and Training: Making Decisions

Authors: O. Starineca, I. Voronchuk

Abstract:

Human Resources (HR) selection and training have various implementation possibilities depending on an organization’s abilities and peculiarities. We propose to base HR selection and training decisions about on a competence-based approach. HR selection and training of employees are topical as there is room for improvement in this field; therefore, the aim of the research is to propose rational decision-making approaches for an organization HR selection and training choice. Our proposals are based on the training development and competence-based selection approaches created within previous researches i.e. Analytic-Hierarchy Process (AHP) and Linear Programming. Literature review on non-formal education, competence-based selection, AHP form our theoretical background. Some educational service providers in Latvia offer employees training, e.g. motivation, computer skills, accounting, law, ethics, stress management, etc. that are topical for Public Administration. Competence-based approach is a rational base for rational decision-making in both HR selection and considering HR training.

Keywords: competence-based selection, human resource, training, decision-making

Procedia PDF Downloads 243
3597 Supplier Selection by Bi-Objectives Mixed Integer Program Approach

Authors: K.-H. Yang

Abstract:

In the past, there was a lot of excellent research studies conducted on topics related to supplier selection. Because the considered factors of supplier selection are complicated and difficult to be quantified, most researchers deal supplier selection issues by qualitative approaches. Compared to qualitative approaches, quantitative approaches are less applicable in the real world. This study tried to apply the quantitative approach to study a supplier selection problem with considering operation cost and delivery reliability. By those factors, this study applies Normalized Normal Constraint Method to solve the dual objectives mixed integer program of the supplier selection problem.

Keywords: bi-objectives MIP, normalized normal constraint method, supplier selection, quantitative approach

Procedia PDF Downloads 333
3596 Conceptual Design of a Customer Friendly Variable Volume and Variable Spinning Speed Washing Machine

Authors: C. A. Akaash Emmanuel Raj, V. R. Sanal Kumar

Abstract:

In this paper using smart materials we have proposed a specially manufactured variable volume spin tub for loading clothes for negating the vibration to a certain extent for getting better operating performance. Additionally, we have recommended a variable spinning speed rotor for handling varieties of garments for an efficient washing, aiming for increasing the life span of both the garments and the machine. As a part of the conflicting dynamic constraints and demands of the customer friendly design optimization of a lucrative and cosmetic washing machine we have proposed a drier and a desalination system capable to supply desirable heat and a pleasing fragrance to the garments. We thus concluded that while incorporating variable volume and variable spinning speed tub integrated with a drier and desalination system, the washing machine could meet the varieties of domestic requirements of the customers cost-effectively.

Keywords: customer friendly washing machine, drier design, quick cloth cleaning, variable tub volume washing machine, variable spinning speed washing machine

Procedia PDF Downloads 162
3595 Item-Trait Pattern Recognition of Replenished Items in Multidimensional Computerized Adaptive Testing

Authors: Jianan Sun, Ziwen Ye

Abstract:

Multidimensional computerized adaptive testing (MCAT) is a popular research topic in psychometrics. It is important for practitioners to clearly know the item-trait patterns of administered items when a test like MCAT is operated. Item-trait pattern recognition refers to detecting which latent traits in a psychological test are measured by each of the specified items. If the item-trait patterns of the replenished items in MCAT item pool are well detected, the interpretability of the items can be improved, which can further promote the abilities of the examinees who attending the MCAT to be accurately estimated. This research explores to solve the item-trait pattern recognition problem of the replenished items in MCAT item pool from the perspective of statistical variable selection. The popular multidimensional item response theory model, multidimensional two-parameter logistic model, is assumed to fit the response data of MCAT. The proposed method uses the least absolute shrinkage and selection operator (LASSO) to detect item-trait patterns of replenished items based on the essential information of item responses and ability estimates of examinees collected from a designed MCAT procedure. Several advantages of the proposed method are outlined. First, the proposed method does not strictly depend on the relative order between the replenished items and the selected operational items, so it allows the replenished items to be mixed into the operational items in reasonable order such as considering content constraints or other test requirements. Second, the LASSO used in this research improves the interpretability of the multidimensional replenished items in MCAT. Third, the proposed method can exert the advantage of shrinkage method idea for variable selection, so it can help to check item quality and key dimension features of replenished items and saves more costs of time and labors in response data collection than traditional factor analysis method. Moreover, the proposed method makes sure the dimensions of replenished items are recognized to be consistent with the dimensions of operational items in MCAT item pool. Simulation studies are conducted to investigate the performance of the proposed method under different conditions for varying dimensionality of item pool, latent trait correlation, item discrimination, test lengths and item selection criteria in MCAT. Results show that the proposed method can accurately detect the item-trait patterns of the replenished items in the two-dimensional and the three-dimensional item pool. Selecting enough operational items from the item pool consisting of high discriminating items by Bayesian A-optimality in MCAT can improve the recognition accuracy of item-trait patterns of replenished items for the proposed method. The pattern recognition accuracy for the conditions with correlated traits is better than those with independent traits especially for the item pool consisting of comparatively low discriminating items. To sum up, the proposed data-driven method based on the LASSO can accurately and efficiently detect the item-trait patterns of replenished items in MCAT.

Keywords: item-trait pattern recognition, least absolute shrinkage and selection operator, multidimensional computerized adaptive testing, variable selection

Procedia PDF Downloads 55
3594 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection

Authors: Yaojun Wang, Yaoqing Wang

Abstract:

Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection system. A profitable stock-selection system should consider the stock’s investment value and the market timing. In this paper, we present a hybrid system including both engage for stock selection. This system uses a case-based reasoning (CBR) model to execute the stock classification, uses a decision-tree model to help with market timing and stock selection. The experiments show that the performance of this hybrid system is better than that of other techniques regarding to the classification accuracy, the average return and the Sharpe ratio.

Keywords: case-based reasoning, decision tree, stock selection, machine learning

Procedia PDF Downloads 305
3593 Variable Selection in a Data Envelopment Analysis Model by Multiple Proportions Comparison

Authors: Jirawan Jitthavech, Vichit Lorchirachoonkul

Abstract:

A statistical procedure using multiple comparisons test for proportions is proposed for variable selection in a data envelopment analysis (DEA) model. The test statistic in the multiple comparisons is the proportion of efficient decision making units (DMUs) in a DEA model. Three methods of multiple comparisons test for proportions: multiple Z tests with Bonferroni correction, multiple tests in 2Xc crosstabulation and the Marascuilo procedure, are used in the proposed statistical procedure of iteratively eliminating the variables in a backward manner. Two simulation populations of moderately and lowly correlated variables are used to compare the results of the statistical procedure using three methods of multiple comparisons test for proportions with the hypothesis testing of the efficiency contribution measure. From the simulation results, it can be concluded that the proposed statistical procedure using multiple Z tests for proportions with Bonferroni correction clearly outperforms the proposed statistical procedure using the remaining two methods of multiple comparisons and the hypothesis testing of the efficiency contribution measure.

Keywords: Bonferroni correction, efficient DMUs, Marascuilo procedure, Pastor et al. method, 2xc crosstabulation

Procedia PDF Downloads 232
3592 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technology, which collects the data from various sensors and those data will be used in many fields. Data retrieval is one of the major issue where there is a need to extract the exact data as per the need. In this paper, large amount of data set is processed by using the feature selection. Feature selection helps to choose the data which are actually needed to process and execute the task. The key value is the one which helps to point out exact data available in the storage space. Here the available data is streamed and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 175
3591 A Relational Case-Based Reasoning Framework for Project Delivery System Selection

Authors: Yang Cui, Yong Qiang Chen

Abstract:

An appropriate project delivery system (PDS) is crucial to the success of a construction project. Case-based reasoning (CBR) is a useful support for PDS selection. However, the traditional CBR approach represents cases as attribute-value vectors without taking relations among attributes into consideration, and could not calculate the similarity when the structures of cases are not strictly same. Therefore, this paper solves this problem by adopting the relational case-based reasoning (RCBR) approach for PDS selection, considering both the structural similarity and feature similarity. To develop the feature terms of the construction projects, the criteria and factors governing PDS selection process are first identified. Then, feature terms for the construction projects are developed. Finally, the mechanism of similarity calculation and a case study indicate how RCBR works for PDS selection. The adoption of RCBR in PDS selection expands the scope of application of traditional CBR method and improves the accuracy of the PDS selection system.

Keywords: relational cased-based reasoning, case-based reasoning, project delivery system, PDS selection

Procedia PDF Downloads 350
3590 Clustering-Based Infinite Feature Selection Method

Authors: S. F. Hassani Ziabari, S. Eskandari, M. Salahi

Abstract:

Infinite Feature Selection (IFS) algorithm is an efficient feature selection algorithm that selects a subset of features of all sizes (including infinity). In this paper, we present an improved version of it, called clustering IFS (CIFS), by clustering the dataset in advance. To do so, first, we apply the K-means algorithm to cluster the dataset, then we apply IFS. In the CIFS method, the spatial and time complexities are reduced compared to the IFS method. Experimental results on six datasets show the superiority of CIFS compared to IFS in terms of accuracy, running time, and memory consumption.

Keywords: feature selection, infinite feature selection, clustering, graph

Procedia PDF Downloads 8
3589 Efficient Single Relay Selection Scheme for Cooperative Communication

Authors: Sung-Bok Choi, Hyun-Jun Shin, Hyoung-Kyu Song

Abstract:

This paper proposes a single relay selection scheme in cooperative communication. Decode and forward scheme is considered when a source node wants to cooperate with a single relay for data transmission. To use the proposed single relay selection scheme, the source node make a little different pattern signal which is not complex pattern and broadcasts it. The proposed scheme does not require the channel state information between the source node and candidates of the relay during the relay selection. Therefore, it is able to be used in many fields.

Keywords: relay selection, cooperative communication, df, channel codes

Procedia PDF Downloads 473