Search results for: unsupervised feature ranking
Paper Count: 1987

1807 Pantograph-Catenary Contact Force: Features Evaluation for Catenary Diagnostics

Authors: Mehdi Brahimi, Kamal Medjaher, Noureddine Zerhouni, Mohammed Leouatni

Abstract:

Prognostics and Health Management (PHM) is a systems engineering discipline that provides solutions and models for the implementation of predictive maintenance. The approach is based on extracting useful information from monitoring data to assess the “health” state of industrial equipment or an asset. In this paper, we examine multiple features extracted from the Pantograph-Catenary contact force in order to select the most relevant ones for a diagnostics function. The feature extraction methodology is based on measurement data and on simulation data generated by a Pantograph-Catenary simulation software called INPAC. The feature extraction method relies on both statistical and signal processing analyses, while the feature selection method is based on statistical criteria.
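
As a rough illustration of this kind of pipeline, the sketch below (Python) extracts a common set of statistical features from a contact-force window and ranks them with a Fisher-score-style criterion; the feature set and the criterion are assumptions, since the abstract does not name the specific statistical criteria used:

import numpy as np
from scipy import stats

def extract_features(force_window):
    """A common set of statistical features for one window of signal."""
    w = np.asarray(force_window, dtype=float)
    rms = np.sqrt(np.mean(w ** 2))
    return {"mean": w.mean(), "std": w.std(), "rms": rms,
            "kurtosis": stats.kurtosis(w), "skewness": stats.skew(w),
            "peak_to_peak": np.ptp(w),
            "crest_factor": np.max(np.abs(w)) / rms}

def fisher_score(values, labels):
    """Rank one feature: between-class over within-class variance."""
    values, labels = np.asarray(values), np.asarray(labels)
    classes = np.unique(labels)
    between = sum((values[labels == c].mean() - values.mean()) ** 2 for c in classes)
    within = sum(values[labels == c].var() for c in classes)
    return between / (within + 1e-12)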

Keywords: catenary/pantograph interaction, diagnostics, Prognostics and Health Management (PHM), quality of current collection

Procedia PDF Downloads 260
1806 Simple Multiple-Attribute Rating Technique for Optimal Decision-Making Model on Selecting Best Spiker of World Grand Prix

Authors: Chen Chih-Cheng, Chen I-Cheng, Lee Yung-Tan, Kuo Yen-Whea, Yu Chin-Hung

Abstract:

The purpose of this study is to construct a model for selecting the best spike player in a top world volleyball tournament. Data consisted of the records of the 2013 World Grand Prix published by the International Volleyball Federation (FIVB). The Simple Multiple-Attribute Rating Technique (SMART) was used to build an optimal decision-making model for best spike player selection. The results showed that the best spike player ranking produced by SMART differs from the ranking published by the FIVB, demonstrating the effectiveness and feasibility of the proposed model.
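
A minimal sketch of SMART as described, in Python; the attribute values and weights below are hypothetical:

import numpy as np

def smart_ranking(scores, weights):
    """SMART: normalize each attribute to [0, 1], weight, sum, rank.
    scores: (n_players, n_attributes); weights should sum to 1."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    utilities = (scores - lo) / np.where(hi > lo, hi - lo, 1.0)
    total = utilities @ np.asarray(weights)
    return np.argsort(-total), total  # best player first

# Hypothetical spiking attributes: points, attempts, efficiency
players = [[152, 300, 0.51], [140, 250, 0.56], [160, 340, 0.47]]
order, utility = smart_ranking(players, weights=[0.5, 0.2, 0.3])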

Keywords: simple multiple-attribute rating technique, World Grand Prix, best spike player, International Volleyball Federation

Procedia PDF Downloads 442
1805 A Robust Spatial Feature Extraction Method for Facial Expression Recognition

Authors: H. G. C. P. Dinesh, G. Tharshini, M. P. B. Ekanayake, G. M. R. I. Godaliyadda

Abstract:

This paper presents a new spatial feature extraction method based on principal component analysis (PCA) and Fisher discriminant analysis (FDA) for facial expression recognition. It not only extracts reliable features for classification but also reduces the dimensionality of the feature space of the pattern samples. In this method, each grayscale image is first considered in its entirety as the measurement matrix. Then, the principal components (PCs) of the row vectors of this matrix, and the variance of these row vectors along the PCs, are estimated; this ensures the preservation of the spatial information of the facial image. Afterwards, by incorporating the spectral information of the eigen-filters derived from the PCs, a feature vector is constructed for a given image. Finally, FDA is used to define a set of basis vectors in a reduced-dimension subspace such that optimal clustering is achieved. FDA defines an inter-class scatter matrix and an intra-class scatter matrix to enhance the compactness of each cluster while maximizing the distance between cluster marginal points. To match the test image with the training set, a cosine-similarity-based Bayesian classification is used. The proposed method was tested on the Cohn-Kanade and JAFFE databases. It was observed that the proposed method, which incorporates spatial information to construct an optimal feature space, outperforms the standard PCA- and FDA-based methods.
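
A simplified sketch of the overall pipeline, using scikit-learn's standard PCA and LDA as stand-ins for the paper's spatial row-vector PCA and eigen-filter construction, and a cosine-similarity nearest neighbour in place of its Bayesian classifier:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics.pairwise import cosine_similarity

def fit_pca_fda(train_X, train_y, n_pcs=50):
    """train_X: flattened grayscale faces (n_samples, n_pixels)."""
    pca = PCA(n_components=n_pcs).fit(train_X)
    fda = LinearDiscriminantAnalysis().fit(pca.transform(train_X), train_y)
    return pca, fda

def classify(test_img, train_X, train_y, pca, fda):
    """Nearest training sample by cosine similarity in the FDA subspace."""
    train_feats = fda.transform(pca.transform(train_X))
    test_feat = fda.transform(pca.transform(test_img.reshape(1, -1)))
    sims = cosine_similarity(test_feat, train_feats).ravel()
    return np.asarray(train_y)[np.argmax(sims)]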

Keywords: facial expression recognition, principal component analysis (PCA), Fisher discriminant analysis (FDA), eigen-filter, cosine similarity, Bayesian classifier, F-measure

Procedia PDF Downloads 401
1804 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing step for avoiding the Hughes phenomenon, given the difficulty of collecting training samples. Hence, many studies have developed feature selection methods, such as the F-score and the Hilbert-Schmidt Independence Criterion (HSIC), to improve hyperspectral image classification. However, most of them only consider class separability in the original space, i.e., linear class separability. In this study, we propose a nonlinear class separability measure based on the kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability is formed by a generalized RBF kernel with a different bandwidth for each feature, and it considers both the within-class and the between-class separability. A genetic algorithm is applied to tune these bandwidths such that the within-class separability is minimized and the between-class separability is maximized simultaneously. This indicates that the corresponding feature space is more suitable for classification and that the corresponding nonlinear classification boundary can separate the classes well. These optimal bandwidths also reveal the importance of individual bands for hyperspectral image classification: the reciprocals of the bandwidths can be viewed as band weights, where the smaller the bandwidth, the larger the weight and the more important the band. Hence, sorting the bands by the descending order of the reciprocals of their bandwidths gives an order for selecting appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate that the feature subsets selected by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of the samples were randomly selected to form the training dataset, and all non-background samples were used to form the testing dataset. A support vector machine was applied to classify the testing samples based on the selected feature subsets. On the Indian Pine Site data set with 220 bands, the highest accuracies obtained by the proposed method, the F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively; however, the proposed method selects 158 features, whereas the F-score and HSIC select 168 and 217 features, respectively. Moreover, the classification accuracy increases dramatically using only the first few features: the accuracies for feature subsets of 10, 20, 50, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, using only half of the features selected by the proposed method (110 features), the corresponding classification accuracy (0.84164) approximates the highest classification accuracy of 0.8795. Similar results were obtained for the other two hyperspectral image data sets, the PAVIA and Salinas A data sets. These results illustrate that the proposed method can efficiently find feature subsets that improve hyperspectral image classification. One can apply the proposed method to determine a suitable feature subset first, according to a specific purpose, and then use only the corresponding sensors to obtain the hyperspectral image and classify the samples. This not only improves classification performance but also reduces the cost of obtaining hyperspectral images.
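
The sketch below illustrates the idea of a per-band generalized RBF kernel and a within/between-class separability criterion; random search replaces the paper's genetic algorithm, and the data are a toy stand-in:

import numpy as np

def generalized_rbf(X1, X2, gammas):
    """K(x, z) = exp(-sum_b gamma_b (x_b - z_b)^2), one gamma per band."""
    d = ((X1[:, None, :] - X2[None, :, :]) ** 2 * gammas).sum(axis=2)
    return np.exp(-d)

def separability(X, y, gammas):
    """Large kernel values mean small distances, so we want high values
    within classes (compactness) and low values between classes."""
    cs = np.unique(y)
    within = np.mean([generalized_rbf(X[y == c], X[y == c], gammas).mean()
                      for c in cs])
    between = np.mean([generalized_rbf(X[y == a], X[y == b], gammas).mean()
                       for a in cs for b in cs if a != b])
    return within - between

rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 8)), rng.integers(0, 2, 60)   # toy stand-in
best_score, best_g = -np.inf, None
for _ in range(200):            # random search in place of the paper's GA
    g = rng.uniform(0.01, 10.0, X.shape[1])
    s = separability(X, y, g)
    if s > best_score:
        best_score, best_g = s, g
band_order = np.argsort(-best_g)  # gamma = 1/(2 sigma^2): larger gamma, more important band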

Keywords: hyperspectral image classification, nonlinear feature selection, kernel trick, support vector machine

Procedia PDF Downloads 240
1803 Efficient Human Motion Detection Feature Set by Using Local Phase Quantization Method

Authors: Arwa Alzughaibi

Abstract:

Human motion detection is a challenging task due to a number of factors, including variable appearance and posture and a wide range of illumination conditions and backgrounds. The first requirement of such a model is therefore a reliable feature set that can discriminate between a human and a non-human form with a fair amount of confidence, even under difficult conditions. With richer representations, the classification task becomes easier and improved results can be achieved. The aim of this paper is to investigate reliable and accurate human motion detection models that can detect human motions under varying illumination levels and backgrounds. Different feature sets are tried and tested, including Histograms of Oriented Gradients (HOG), the Deformable Parts Model (DPM), Locally Decorrelated Channel Features (LDCF), and Aggregate Channel Features (ACF). We propose an efficient and reliable human motion detection approach that combines HOG and local phase quantization (LPQ) as the feature set and implements a search pruning algorithm based on optical flow to reduce the number of false positives. Experimental results show that combining the local phase quantization descriptor with the histogram of oriented gradients performs well over a larger range of illumination conditions and backgrounds than state-of-the-art human detectors. The area under the ROC curve (AUC) of the proposed method reached 0.781 on the UCF dataset and 0.826 on the CDW dataset, indicating that it performs better than the HOG, DPM, LDCF, and ACF methods.
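
A sketch of the combined feature set, using skimage's HOG implementation and a minimal LPQ; the decorrelation step of full LPQ and the optical-flow pruning stage are omitted, and the window and cell sizes are illustrative:

import numpy as np
from scipy.signal import convolve2d
from skimage.feature import hog

def lpq_histogram(gray, win=7):
    """Minimal LPQ: local STFT at four low frequencies via separable
    kernels, 8-bit quantization of the signs of the real/imaginary parts."""
    r = (win - 1) // 2
    x = np.arange(-r, r + 1)
    w0 = np.ones(win, dtype=complex)
    w1 = np.exp(-2j * np.pi * x / win)        # lowest non-zero frequency
    pairs = [(w0, w1), (w1, w0), (w1, w1), (w1, np.conj(w1))]
    code = np.zeros(gray.shape, dtype=np.int32)
    bit = 0
    for wy, wx in pairs:
        resp = convolve2d(convolve2d(gray.astype(float), wy[:, None], mode="same"),
                          wx[None, :], mode="same")
        code |= (resp.real > 0).astype(np.int32) << bit
        code |= (resp.imag > 0).astype(np.int32) << (bit + 1)
        bit += 2
    return np.bincount(code.ravel(), minlength=256) / code.size

def hog_lpq(gray):
    h = hog(gray, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return np.concatenate([h, lpq_histogram(gray)])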

Keywords: human motion detection, histograms of oriented gradients, local phase quantization

Procedia PDF Downloads 229
1802 Optimal Pricing Based on Real Estate Demand Data

Authors: Vanessa Kummer, Maik Meusel

Abstract:

Real estate demand estimates are typically derived from transaction data. However, in regions with excess demand, transactions are driven by supply and therefore do not indicate what people are actually looking for. To estimate the demand for housing in Switzerland, search subscriptions from all important Swiss real estate platforms are used. These data do, however, suffer from missing information; for example, many users do not specify how many rooms they would like or what price they would be willing to pay. Economic analyses often use only complete records. Usually, however, the proportion of complete records is rather small, so most of the information is neglected, and the complete records themselves may be strongly biased. In addition, the reason that data are missing might itself contain information, which is ignored by that approach. An interesting question is, therefore, whether for economic analyses such as the one at hand there is added value in using the whole data set with imputed missing values, compared to using the usually small share of complete records (the baseline), and how different algorithms affect the result. The imputation of the missing data is done using unsupervised learning. Out of the numerous unsupervised learning approaches, the most common ones, such as clustering, principal component analysis, and neural network techniques, are applied. By training the model iteratively on the imputed data, thereby including the information of all records in the model, the distortion of the first training set (the complete records) vanishes. In a next step, the performance of the algorithms is measured by randomly creating missing values in subsets of the data, estimating those values with the relevant algorithms under several parameter combinations, and comparing the estimates to the actual data. After the optimal parameter set has been found for each algorithm, the missing values are imputed. Using the resulting data sets, the willingness to pay for real estate is estimated by fitting price distributions for properties with certain characteristics, such as the region or the number of rooms. Based on these distributions, survival functions are computed to obtain the functional relationship between characteristics and selling probabilities. Comparing the survival functions shows that estimates based on the imputed data sets do not differ significantly from each other, whereas the demand estimate derived from the baseline data does. This indicates that the baseline data set does not include all available information and is therefore not representative of the entire sample. Demand estimates derived from the whole data set are also much more accurate than the baseline estimate. Thus, in order to obtain optimal results, it is important to make use of all available data, even though this involves additional procedures such as data imputation.
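
One simple unsupervised imputer of the kind compared in the study is a k-means-based scheme; a minimal sketch, with the cluster count and iteration budget chosen arbitrarily:

import numpy as np
from sklearn.cluster import KMeans

def kmeans_impute(X, n_clusters=5, n_iter=10, seed=0):
    """Iteratively replace missing values (NaN) with cluster means."""
    X = np.array(X, dtype=float)
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[missing] = col_means[np.where(missing)[1]]   # warm start: column means
    for _ in range(n_iter):
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
        X[missing] = km.cluster_centers_[km.labels_][missing]  # refine estimates
    return X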

Keywords: demand estimate, missing-data imputation, real estate, unsupervised learning

Procedia PDF Downloads 259
1801 A Clustering Algorithm for Massive Texts

Authors: Ming Liu, Chong Wu, Bingquan Liu, Lei Chen

Abstract:

Internet users face massive amounts of textual data every day. Organizing texts into categories can help users dig useful information out of large-scale text collections, and clustering is one of the most promising tools for categorizing texts due to its unsupervised nature. Unfortunately, most traditional clustering algorithms lose their effectiveness on large-scale text collections, mainly because of the high-dimensional vectors generated from texts. To cluster large-scale text collections effectively and efficiently, this paper proposes a vector-reconstruction-based clustering algorithm in which only the features that can represent a cluster are preserved in the cluster’s representative vector. The algorithm alternates between two sub-processes until it converges. One is the partial tuning sub-process, where each feature’s weight is fine-tuned iteratively; to accelerate clustering, an intersection-based similarity measurement and a corresponding neuron adjustment function are proposed and implemented in this sub-process. The other is the overall tuning sub-process, where features are reallocated among different clusters and the features useless for representing a cluster are removed from its representative vector. Experimental results on three text collections (two small-scale and one large-scale) demonstrate that our algorithm obtains high quality on both small-scale and large-scale text collections.

Keywords: vector reconstruction, large-scale text clustering, partial tuning sub-process, overall tuning sub-process

Procedia PDF Downloads 404
1800 Video Text Information Detection and Localization in Lecture Videos Using Moments

Authors: Belkacem Soundes, Guezouli Larbi

Abstract:

This paper presents a robust and accurate method for text detection and localization in lecture videos, in which frame regions are classified into text or background based on visual feature analysis. Lecture videos, however, show significant degradation, mainly related to acquisition conditions, camera motion, and environmental changes, resulting in low-quality videos that hamper the efficiency of feature extraction and description. Moreover, traditional text detection methods cannot be directly applied to lecture videos; robust feature extraction methods dedicated to this specific video genre are therefore required for accurate text detection and extraction. The method consists of a three-step process: slide region detection and segmentation, feature extraction, and non-text filtering. For robust and effective feature extraction, moment functions of two distinct types are used: orthogonal and non-orthogonal. For the orthogonal type, both Zernike and pseudo-Zernike moments are used, whereas for the non-orthogonal type Hu moments are used. Their expressivity and description efficiency are presented and discussed. The proposed approach shows that, in general, orthogonal moments achieve higher accuracy than non-orthogonal ones, and that pseudo-Zernike moments are more effective than Zernike moments, with better computation time.
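
For the non-orthogonal case, Hu moments are available directly in OpenCV; a minimal sketch (pseudo-Zernike moments would need another library, e.g. mahotas, and are not shown):

import cv2
import numpy as np

def hu_descriptor(region):
    """Seven Hu invariant moments of a grayscale slide region, log-scaled
    to compress their dynamic range."""
    hu = cv2.HuMoments(cv2.moments(region.astype(np.float32))).ravel()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)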

Keywords: text detection, text localization, lecture videos, pseudo-Zernike moments

Procedia PDF Downloads 122
1799 Musical Instruments Classification Using Machine Learning Techniques

Authors: Bhalke D. G., Bormane D. S., Kharate G. K.

Abstract:

This paper presents the classification of musical instruments using machine learning techniques. The classification has been carried out using temporal, spectral, cepstral, and wavelet features, and a detailed feature analysis is carried out using separate and combined features. Further, instrument models have been developed using K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) classifiers. The benchmark McGill University database has been used to test the performance of the system. Experimental results show that SVM performs better than the KNN classifier.
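
A sketch of the feature-extraction-plus-classifier setup, assuming librosa for the temporal, spectral, and cepstral features (wavelet features are omitted) and scikit-learn for the classifiers:

import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def cepstral_features(path):
    """Summary statistics of frame-wise temporal/spectral/cepstral features."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)        # cepstral
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)  # spectral
    zcr = librosa.feature.zero_crossing_rate(y)               # temporal
    frames = np.concatenate([mfcc, centroid, zcr])
    return np.hstack([frames.mean(axis=1), frames.std(axis=1)])

# X = np.vstack([cepstral_features(p) for p in paths]); y = instrument labels
# SVC(kernel="rbf", C=10).fit(X, y) versus KNeighborsClassifier(5).fit(X, y)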

Keywords: feature extraction, SVM, KNN, musical instruments

Procedia PDF Downloads 454
1798 A Framework for Analyzing Public Interaction of Saudi Universities on Twitter

Authors: Sahar Al-Qahtani, Rabeeh Ayaz Abbasi, Naif Radi Aljohani

Abstract:

Many universities use social media platforms as new communication channels to disseminate information and promptly communicate with their audience. As Twitter is one of the most widely used social media platforms, this research explores its adoption and utilization by universities. We propose a framework called 'Social Network Analysis for Universities on Twitter' (SNAUT) to analyze the usage of Twitter by universities and to measure their interaction with the public. The study includes a sample of around 110,000 tweets from 36 Saudi universities, both public and private. Using SNAUT, we can (1) investigate the purpose of using Twitter by universities, (2) determine the broad topics discussed by them, and (3) identify the groups closely associated with the universities. The results show that most Saudi universities, whether public or private, actively use Twitter. They also reveal that public universities respond to public queries more frequently, while private universities stand out more in terms of information dissemination through retweets and diverse hashtags. Finally, we develop a ranking mechanism in SNAUT for ranking universities based on their social interaction with the public on Twitter.

Keywords: social media, twitter, social network analysis, universities, higher education, Saudi Arabia

Procedia PDF Downloads 103
1797 Dynamic Distribution Calibration for Improved Few-Shot Image Classification

Authors: Majid Habib Khan, Jinwei Zhao, Xinhong Hei, Liu Jiedong, Rana Shahzad Noor, Muhammad Imran

Abstract:

Deep learning is increasingly employed in image classification, yet the scarcity and high cost of labeled training data remain a challenge, and limited samples often lead to overfitting due to biased sample distributions. This paper introduces a dynamic distribution calibration method for few-shot learning. Initially, base-class and new-class samples undergo normalization to mitigate disparate feature magnitudes, and a pre-trained model extracts feature vectors from both classes. The method then dynamically selects distribution characteristics from base classes (both adjacent and remote) in the embedding space, using a threshold-value approach for the new-class samples. Given the propensity of similar classes to share feature distribution statistics such as mean and variance, this research assumes a Gaussian distribution for the feature vectors. Subsequently, the distributional features of the new-class samples are calibrated using a corrected hyperparameter derived from the distribution features of both adjacent and distant base classes, and the calibration augments the new-class sample set. The technique demonstrates significant improvements, with up to 4% accuracy gains on few-shot classification challenges, as evidenced by tests on the miniImagenet and CUB datasets.
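
A sketch of the core calibration step, following the standard distribution-calibration recipe; the paper's threshold-based dynamic selection of adjacent and remote base classes is simplified here to the k nearest base classes, and alpha is a hypothetical spread-correction hyperparameter:

import numpy as np

def calibrate(support_feat, base_means, base_covs, k=2, alpha=0.21, n_aug=100):
    """Borrow Gaussian statistics from the k nearest base classes to augment
    one normalized support feature vector with sampled synthetic features."""
    dists = np.linalg.norm(base_means - support_feat, axis=1)
    nearest = np.argsort(dists)[:k]                 # 'adjacent' base classes
    mean = (base_means[nearest].sum(axis=0) + support_feat) / (k + 1)
    cov = base_covs[nearest].mean(axis=0) + alpha   # alpha corrects the spread
    rng = np.random.default_rng(0)
    return rng.multivariate_normal(mean, cov, size=n_aug)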

Keywords: deep learning, computer vision, image classification, few-shot learning, threshold

Procedia PDF Downloads 31
1796 A New Heuristic Algorithm for Maximizing Total Demands of Nodes and Number of Covered Nodes Simultaneously

Authors: Ehsan Saghehei, Mahdi Eghbali

Abstract:

The maximal covering location problem (MCLP) was originally developed to determine a set of facility locations that maximizes the total customer demand serviced by the facilities within a predetermined critical service criterion. However, in problems where the differences between the demands of the covered nodes, or in the number of nodes covered by each facility, are large, standard methods for solving the MCLP may ignore these differences. In this paper, a heuristic solution is proposed based on ranking the demand at each node and the number of nodes covered by each node, according to a predetermined critical value. The output of this method maximizes the total demand of the covered nodes and the number of covered nodes simultaneously. The solution algorithm is described through an example, and its results are compared with the Greedy and Lagrange algorithms. Results of the algorithm on larger problem sizes, compared with other methods, are also provided. A summary and future work conclude the paper.
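
A simplified sketch of a ranking heuristic in this spirit, scoring candidate sites by covered demand and covered-node count with equal weights (the paper's critical-value mechanism is not reproduced):

import numpy as np

def rank_and_cover(demand, cover, budget):
    """Greedy heuristic: score candidate sites by total covered demand and
    number of covered nodes, then open the best `budget` sites.
    cover[i, j] = 1 if a facility at site i covers node j."""
    demand, cover = np.asarray(demand, float), np.asarray(cover, int)
    covered_demand = cover @ demand            # criterion 1
    covered_count = cover.sum(axis=1)          # criterion 2
    score = (covered_demand / covered_demand.max()
             + covered_count / covered_count.max())  # equal-weight ranking
    return np.argsort(-score)[:budget]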

Keywords: heuristic solution, maximal covering location problem, ranking, set covering

Procedia PDF Downloads 545
1795 The Impact of University League Tables on the Development of Non-Elite Universities: A Case Study of England

Authors: Lois Cheung

Abstract:

This article examines the impact of league tables on non-elite universities in the English higher education system. The purpose of this study is to explore the use of rankings in strategic planning by low-ranked universities in this highly competitive higher education market. A sample of non-elite universities was selected for a content analysis based on the measures used by The Guardian rankings. Interestingly, these universities care about their rankings within a single national system, and content analysis proves an effective approach to investigating such influences. It is particularly noteworthy that all sampled universities use the terminology of these measures in their strategic plans, missions, and news coverage on their institutional web pages. This analysis illustrates the key challenges that many low-ranked universities in England are likely facing in a highly competitive and diversified higher education market. These universities use rankings to communicate with their stakeholders, mainly students, in order to fill places and thereby secure their major source of funding. The study concludes with comments on the likely effects of the rankings paradigm in undermining the contributions of non-elite universities.

Keywords: league tables, measures, post-1992 universities, ranking, strategy

Procedia PDF Downloads 152
1794 Efficiency Measurement of Turkish Universities via the Stochastic Frontier Model

Authors: Yeliz Mert Kantar, İsmail Yeni̇lmez, Ibrahim Arik

Abstract:

In this study, an efficiency measurement of the top fifty Turkish universities has been conducted. The top fifty Turkish universities are listed every year by the Scientific and Technological Research Council of Turkey (TÜBİTAK) according to the Entrepreneur and Innovative University Index. Since 2018, the index has been calculated from four components: scientific and technological research competency, intellectual property pool, cooperation and interaction, and economic and social contribution, which together consist of twenty-three sub-components. The 2021 list, announced in January 2022, is discussed in this study. The efficiency analysis has been carried out using the stochastic frontier model. The statistical significance of the sub-components that make up the index with certain weights has been examined in terms of the efficiency measurement calculated through the stochastic frontier model. The relationship between the efficiency ranking estimated by the stochastic frontier model and the Entrepreneur and Innovative University Index ranking is discussed in detail.

Keywords: efficiency, entrepreneur and innovative universities, Turkish universities, stochastic frontier model, TÜBİTAK

Procedia PDF Downloads 63
1793 Enhancing Financial Security: Real-Time Anomaly Detection in Financial Transactions Using Machine Learning

Authors: Ali Kazemi

Abstract:

The digital evolution of financial services, while offering unprecedented convenience and accessibility, has also escalated vulnerability to fraudulent activities. In this study, we introduce a distinct approach to real-time anomaly detection in financial transactions, aiming to fortify the defenses of banking and financial institutions against such threats. Utilizing unsupervised machine learning algorithms, specifically autoencoders and isolation forests, our research focuses on identifying irregular patterns indicative of fraud within transactional data, thus enabling immediate action to prevent financial loss. The data used in this study included the monetary value of each transaction, a crucial feature since fraudulent transactions may follow different amount distributions than legitimate ones; timestamps indicating when transactions occurred, since analyzing transactions' temporal patterns can reveal anomalies (e.g., unusual activity in the middle of the night); the sector or category of the merchant where the transaction occurred, such as retail, groceries, or online services, since specific categories may be more prone to fraud; and the type of payment used (e.g., credit, debit, online payment systems), since different payment methods carry different levels of fraud risk. This dataset, anonymized to ensure privacy, reflects a wide array of transactions typical of a global banking institution, ranging from small-scale retail purchases to large wire transfers, embodying the diverse nature of potentially fraudulent activities. By engineering features that capture the essence of transactions, including normalized amounts and encoded categorical variables, we tailor our data to enhance model sensitivity to anomalies. The autoencoder model leverages its reconstruction-error mechanism to flag transactions that deviate significantly from the learned normal pattern, while the isolation forest identifies anomalies based on their susceptibility to isolation from the majority of the dataset. Our experimental results, validated through techniques such as k-fold cross-validation, are evaluated using precision, recall, and the F1 score, alongside the area under the receiver operating characteristic (ROC) curve. Our models achieved an F1 score of 0.85 and a ROC AUC of 0.93, indicating high accuracy in detecting fraudulent transactions without excessive false positives. This study contributes to the academic discourse on financial fraud detection and provides a practical framework for banking institutions seeking to implement real-time anomaly detection systems. By demonstrating the effectiveness of unsupervised learning techniques in a real-world context, our research offers a pathway to significantly reduce the incidence of financial fraud, thereby enhancing the security and trustworthiness of digital financial services.
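
A minimal sketch of the two detectors, using scikit-learn's IsolationForest and, as a lightweight linear stand-in for the autoencoder's reconstruction-error mechanism, PCA reconstruction error; the feature layout in the comment is hypothetical:

import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

def flag_anomalies(X, contamination=0.01):
    """X: engineered transaction features, e.g. normalized amount, hour of
    day, one-hot merchant category, one-hot payment type (hypothetical)."""
    X = StandardScaler().fit_transform(X)
    iso = IsolationForest(contamination=contamination, random_state=0).fit(X)
    pca = PCA(n_components=min(10, X.shape[1] - 1)).fit(X)   # linear stand-in
    recon = pca.inverse_transform(pca.transform(X))
    err = np.mean((X - recon) ** 2, axis=1)                  # reconstruction error
    return (iso.predict(X) == -1) | (err > np.quantile(err, 1 - contamination))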

Keywords: anomaly detection, financial fraud, machine learning, autoencoders, isolation forest, transactional data analysis

Procedia PDF Downloads 19
1792 A Hybrid Data Mining Algorithm Based System for Intelligent Defence Mission Readiness and Maintenance Scheduling

Authors: Shivam Dwivedi, Sumit Prakash Gupta, Durga Toshniwal

Abstract:

It is a challenging task today to keep defence forces in the highest state of combat readiness under budgetary constraints, and a huge amount of time and money is squandered on unnecessary and expensive traditional maintenance activities. To overcome this limitation, a Defence Intelligent Mission Readiness and Maintenance Scheduling System is proposed, which ameliorates the maintenance system by diagnosing equipment condition and predicting maintenance requirements. Based on new data mining algorithms, this system intelligently optimises mission readiness for imminent operations and maintenance scheduling in repair echelons. With modified data mining algorithms, such as a Weighted Feature Ranking Genetic Algorithm and an SVM-Random Forest Linear ensemble, it improves reliability, availability, and safety while reducing maintenance cost and Equipment Out of Action (EOA) time. The results clearly show that the introduced algorithms have an edge over conventional data mining algorithms, and that the system, utilizing an intelligent condition-based maintenance approach, improves the operational and maintenance decision strategy of the defence force.

Keywords: condition based maintenance, data mining, defence maintenance, ensemble, genetic algorithms, maintenance scheduling, mission capability

Procedia PDF Downloads 266
1791 Unsupervised Segmentation Technique for Acute Leukemia Cells Using Clustering Algorithms

Authors: N. H. Harun, A. S. Abdul Nasir, M. Y. Mashor, R. Hassan

Abstract:

Leukaemia is a blood cancer that contributes to an increasing mortality rate in Malaysia each year. There are two main categories of leukaemia: acute and chronic. The production and development of acute leukaemia cells occur rapidly and uncontrollably, so if acute leukaemia cells could be identified quickly and effectively, proper treatment and medicine could be delivered. Owing to the requirement of prompt and accurate diagnosis, the current study proposes unsupervised pixel segmentation based on clustering algorithms in order to obtain a fully segmented abnormal white blood cell (blast) in acute leukaemia images. To obtain the segmented blast, three clustering algorithms, namely k-means, fuzzy c-means, and moving k-means, were applied to the saturation component image. Then, a median filter and a seeded region growing area extraction algorithm were applied to smooth the region of the segmented blast and to remove large unwanted regions from the image, respectively. Comparisons among the three clustering algorithms were made to measure the performance of each on segmenting the blast area. Based on the good sensitivity values obtained, the results indicate that the moving k-means clustering algorithm successfully produces a fully segmented blast region in acute leukaemia images, suggesting that the resultant images could help haematologists in the further analysis of acute leukaemia.
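
A simplified sketch of the saturation-channel clustering step, using plain k-means (the paper's fuzzy c-means and moving k-means variants, and the seeded region growing step, are not shown):

import numpy as np
from skimage import color, filters
from sklearn.cluster import KMeans

def segment_blast(rgb_image, n_clusters=3):
    """Cluster the HSV saturation channel and keep the most saturated
    cluster as the blast region, then smooth it with a median filter."""
    s = color.rgb2hsv(rgb_image)[..., 1]
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(s.reshape(-1, 1)).reshape(s.shape)
    blast = np.argmax([s[labels == k].mean() for k in range(n_clusters)])
    return filters.median((labels == blast).astype(np.uint8))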

Keywords: acute leukaemia images, clustering algorithms, image segmentation, moving k-means

Procedia PDF Downloads 258
1790 Bayesian Network and Feature Selection for Rank Deficient Inverse Problem

Authors: Kyugneun Lee, Ikjin Lee

Abstract:

Parameter estimation with inverse problems often suffers from unfavorable conditions in the real world: useless data and many input parameters make the problem complicated or insoluble, while data refinement and reformulation of the problem can resolve such difficulties. In this research, a method to solve the rank-deficient inverse problem is suggested, applied to a multi-physics system whose rank deficiency is caused by response correlation. Impeditive information is removed and the problem is reformulated into sequential estimations using a Bayesian network (BN) and subset groups. First, subset grouping of the responses is performed, using feature selection with singular value decomposition (SVD). Next, BN inference is used for sequential conditional estimation according to the group hierarchy, with a directed acyclic graph (DAG) structure organized to maximize the estimation ability. The variance ratio of response to noise is used to pair the estimable parameters with each response.

Keywords: Bayesian network, feature selection, rank deficiency, statistical inverse analysis

Procedia PDF Downloads 284
1789 Using Greywolf Optimized Machine Learning Algorithms to Improve Accuracy for Predicting Hospital Readmission for Diabetes

Authors: Vincent Liu

Abstract:

Machine learning (ML) algorithms can achieve high accuracy in predicting outcomes compared to classical models, and metaheuristic, nature-inspired algorithms can enhance traditional ML algorithms by optimizing them, for example by performing feature selection. We compare ten ML algorithms in predicting 30-day hospital readmission rates for diabetes patients in the US, using a dataset from the UCI Machine Learning Repository with feature selection performed by the nature-inspired Greywolf algorithm. The baseline accuracy of the initial random forest model was 65%. After feature engineering, SMOTE for class balancing, and Greywolf optimization, the machine learning algorithms showed better metrics, including F1 scores, accuracy, and confusion matrices, with improvements ranging from 10% to 30%; the best model, XGBoost, reached an accuracy of 95%. Applying machine learning in this way can improve patient outcomes, as unnecessary rehospitalizations can be prevented by focusing on patients at higher risk of readmission.

Keywords: diabetes, machine learning, 30-day readmission, metaheuristic

Procedia PDF Downloads 21
1788 A Relational Case-Based Reasoning Framework for Project Delivery System Selection

Authors: Yang Cui, Yong Qiang Chen

Abstract:

An appropriate project delivery system (PDS) is crucial to the success of a construction project, and case-based reasoning (CBR) is a useful support for PDS selection. However, the traditional CBR approach represents cases as attribute-value vectors without taking the relations among attributes into consideration, and it cannot calculate similarity when the structures of cases are not strictly the same. This paper solves the problem by adopting the relational case-based reasoning (RCBR) approach for PDS selection, considering both structural similarity and feature similarity. To develop the feature terms of construction projects, the criteria and factors governing the PDS selection process are first identified; then, feature terms for the construction projects are developed. Finally, the mechanism of similarity calculation and a case study show how RCBR works for PDS selection. The adoption of RCBR in PDS selection expands the scope of application of the traditional CBR method and improves the accuracy of the PDS selection system.

Keywords: relational case-based reasoning, case-based reasoning, project delivery system, PDS selection

Procedia PDF Downloads 396
1787 Smartphone-Based Human Activity Recognition by Machine Learning Methods

Authors: Yanting Cao, Kazumitsu Nawata

Abstract:

As smartphones are upgraded, their software and hardware become smarter, so smartphone-based human activity recognition can be made more refined, complex, and detailed. In this context, we analyzed a set of experimental data obtained by observing and measuring 30 volunteers performing six activities of daily living (ADL). Owing to the high dimensionality of the data, in particular a 561-feature vector with time- and frequency-domain variables, cleaning these intractable features and training a proper model become extremely challenging. After a series of feature selection steps and parameter adjustments, a well-performing SVM classifier was trained.
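
A sketch of such a feature-selection-plus-SVM pipeline in scikit-learn; the selector, the grid values, and the cross-validation setup are assumptions, since the abstract does not specify them:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif)),  # prune the 561-feature vector
    ("svm", SVC(kernel="rbf")),
])
grid = GridSearchCV(pipe, {"select__k": [100, 200, 400],
                           "svm__C": [1, 10, 100],
                           "svm__gamma": ["scale", 0.01]}, cv=5)
# grid.fit(X_train, y_train); grid.best_params_; grid.score(X_test, y_test)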

Keywords: smart sensors, human activity recognition, artificial intelligence, SVM

Procedia PDF Downloads 120
1786 Iris Feature Extraction and Recognition Based on Two-Dimensional Gabor Wavelet Transform

Authors: Bamidele Samson Alobalorun, Ifedotun Roseline Idowu

Abstract:

Biometric technologies use human body parts for unique and reliable identification based on physiological traits. The iris recognition system is a biometrics-based method of identification; the human iris has discriminating characteristics that make the method efficient. To achieve this efficiency, the distinct features of the human iris must be extracted in order to generate accurate authentication of persons. In this study, an approach to an iris recognition system using a 2D Gabor filter for feature extraction is applied to iris templates. The 2D Gabor filter formulates the patterns that are used for training and are equally sent to the Hamming-distance matching technique for recognition. A comparison of results is presented using two iris image subjects with different matching indices of 1, 2, 3, 4, and 5 filters, based on the CASIA iris image database. By comparing the results for the two subjects, the actual computational time of the developed models, measured in terms of training time and average testing time in processing the Hamming-distance classifier, is found, with a best recognition accuracy of 96.11%. Iris localization, or segmentation, is captured using Daugman's integro-differential operator, and normalization is confined to Daugman's rubber sheet model.
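
A sketch of Gabor-based iris coding and Hamming-distance matching, using skimage's gabor filter; the five frequencies stand in for the paper's matching indices of 1 to 5 filters, whose exact parameters are not given:

import numpy as np
from skimage.filters import gabor

def iris_code(norm_iris, frequencies=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Binary code from the signs of 2D Gabor responses of a normalized
    (rubber-sheet) iris image; one real/imaginary bit pair per filter."""
    bits = []
    for f in frequencies:
        real, imag = gabor(norm_iris, frequency=f)
        bits += [real > 0, imag > 0]
    return np.stack(bits)

def hamming_distance(code_a, code_b):
    return np.mean(code_a != code_b)  # 0 = identical codes, ~0.5 = unrelated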

Keywords: Daugman rubber sheet, feature extraction, Hamming distance, iris recognition system, 2D Gabor wavelet transform

Procedia PDF Downloads 36
1785 Evaluating Contextually Targeted Advertising with Attention Measurement

Authors: John Hawkins, Graham Burton

Abstract:

Contextual targeting is a common advertising strategy that places marketing messages in media locations expected to be aligned with the target audience. There are several major challenges to contextual targeting: the ideal categorisation scheme needs to be known, as well as the most appropriate subsections of that scheme for a given campaign or creative. In addition, campaign reach is typically limited when targeting becomes narrow, so a balance must be struck between requirements. Finally, refinement of the process is limited by evaluation methods that are either rapid but non-specific (click-through rates) or reliable but slow and costly (conversions or brand recall studies). In this study we evaluate attention measurement as a technique for understanding the performance of targeting on the basis of specific contextual topics. We perform the analysis using a large-scale dataset of impressions categorised with the IAB V2.0 taxonomy, evaluating multiple levels of the categorisation hierarchy and categories at different positions within an initial creative-specific ranking. The results illustrate that attention time is an effective signal of the performance of a specific creative within a specific context, and that performance is sustained across a ranking of categories from one period to another.

Keywords: contextual targeting, digital advertising, attention measurement, marketing performance

Procedia PDF Downloads 79
1784 Deployed Confidence: The Testing in Production

Authors: Shreya Asthana

Abstract:

Testers know that a feature they tested on staging is working perfectly in production only after the release has gone live. Sometimes something breaks in production and testers learn of it through a bug raised by an end user; panic starts when staging test results do not reflect current production behavior, and testers begin doubting their skills when a user finally reports a bug. Testers can deploy their confidence on release day by testing in production. Once testing in production begins, test results become more accurate, because tests run on real-time data, and execution is somewhat faster than on staging due to the elimination of bad data. Feature flagging, canary releases, and data cleanup can help to achieve this technique of testing. This paper makes it easier to understand the steps to achieve production testing before making a feature live, and to modify an IT company's testing procedure so that testers can provide a bug-free experience to end users. This study is beneficial because too many people think that testing should be done on staging but not in production, and it is high time to move people out of that old testing mindset. At the end of the day, all that matters is whether the features work in production or not.
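
A minimal sketch of the feature-flagging building block mentioned above: a deterministic percentage rollout that lets a tester exercise a new code path in production for a small cohort before it is fully released (the flag name and account are hypothetical):

import hashlib

def flag_enabled(flag_name: str, user_id: str, rollout_percent: float) -> bool:
    """Deterministic percentage rollout: the same user always gets the same
    answer, so a canary cohort can be verified in production safely."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 100 < rollout_percent

# Gate the new code path, then test it in production with a known account:
if flag_enabled("new-checkout", user_id="tester-42", rollout_percent=5):
    pass  # new feature path, exercised before full rollout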

Keywords: bug free production, new testing mindset, testing strategy, testing approach

Procedia PDF Downloads 30
1783 A Combined Feature Extraction and Thresholding Technique for Silence Removal in Percussive Sounds

Authors: B. Kishore Kumar, Pogula Rakesh, T. Kishore Kumar

Abstract:

Music analysis is a part of audio content analysis in which music is analyzed using different features of the audio signal. The first step in music analysis is to divide the music signal into sections based on the feature profiles of the signal. In this paper, we present a music segmentation technique that effectively segments the signal, together with a thresholding technique to remove silence from the percussive sounds produced by percussive instruments, using two features of music: signal energy and spectral centroid. The proposed method imposes thresholds on both features, and the thresholds vary depending on the music signal. Based on these thresholds, the silent parts are removed and the segmentation is performed. The effectiveness of the proposed method is analyzed using MATLAB.
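
A sketch of the two-feature thresholding idea, in Python rather than the paper's MATLAB; the frame length and threshold ratios are illustrative, whereas the paper derives signal-dependent thresholds:

import numpy as np

def remove_silence(x, sr, frame_ms=20, energy_ratio=0.1, centroid_ratio=0.5):
    """Frame-wise short-time energy and spectral centroid; frames falling
    below both thresholds are treated as silence and dropped."""
    n = int(sr * frame_ms / 1000)
    frames = x[: len(x) // n * n].reshape(-1, n)
    energy = (frames ** 2).sum(axis=1)
    spec = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(n, 1 / sr)
    centroid = (spec * freqs).sum(axis=1) / (spec.sum(axis=1) + 1e-12)
    keep = (energy > energy_ratio * energy.max()) | \
           (centroid > centroid_ratio * centroid.mean())
    return frames[keep].ravel()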

Keywords: percussive sounds, spectral centroid, spectral energy, silence removal, feature extraction

Procedia PDF Downloads 559
1782 An Object-Based Image Resizing Approach

Authors: Chin-Chen Chang, I-Ta Lee, Tsung-Ta Ke, Wen-Kai Tai

Abstract:

Common methods for resizing images include scaling and cropping; however, these two approaches introduce quality problems in the reduced images. In this paper, we propose an image resizing algorithm that separates the main objects from the background. First, we extract two feature maps, an enhanced visual saliency map and an improved gradient map, from the input image. We then integrate these two feature maps into an importance map and, finally, generate the target image using the importance map. The proposed approach obtains the desired results for a wide range of images.
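
A sketch of the importance-map combination, using spectral-residual saliency as a stand-in for the paper's enhanced visual saliency map and a Sobel magnitude as the gradient map; the equal weighting is an assumption. The resulting map could then drive a seam-carving step that removes low-importance seams.

import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter
from skimage import color, filters

def importance_map(rgb):
    """Normalized combination of a saliency map and a gradient map."""
    gray = color.rgb2gray(rgb)
    spectrum = np.fft.fft2(gray)                  # spectral-residual saliency
    log_amp = np.log(np.abs(spectrum) + 1e-12)
    residual = log_amp - uniform_filter(log_amp, size=3)
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * np.angle(spectrum)))) ** 2
    sal = gaussian_filter(sal, sigma=3)
    grad = filters.sobel(gray)                    # gradient-magnitude map
    norm = lambda m: (m - m.min()) / (m.max() - m.min() + 1e-12)
    return 0.5 * norm(sal) + 0.5 * norm(grad)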

Keywords: energy map, visual saliency, gradient map, seam carving

Procedia PDF Downloads 453
1781 Examining How Youth Use Mobile Devices for Health Information: Preliminary Findings of a Survey Study with High School Students in Croatia

Authors: Sung Un Kim, Ivana Martinović, Snježana Stanarević Katavić

Abstract:

As more and more youth use mobile devices, such as tablets and smartphones, for information seeking in their everyday lives, the purpose of this study is to understand the behaviors of youth seeking health information on mobile devices. The specific objective of this study is to examine 1) for what health issues youth use mobile devices, 2) for what reasons youth use mobile devices to obtain health information, 3) in what ways youth use mobile devices for health information, and 4) the features of health applications that youth find useful. The researchers devised a questionnaire for this study. Four hundred eight students from two high schools, located in Osijek, Croatia, participated by answering the questionnaire (281 girls and 127 boys). The collected data were analyzed using descriptive statistics and content analysis. The results show that among all participants, about 85 percent (n = 344) reported having used mobile devices for health information. The most frequent health topic for which they had been using mobile devices is physical activity (n = 273), followed by eating issues and nutrition (n = 224), mental health (n = 160), sexual health (n = 157), alcohol, drugs, and tobacco (n = 125), safety (n = 96) and particular diseases (n = 62). They use mobile devices to obtain health information due to the ease of use (n = 342), the ease of sharing health information (n = 281), portability (n = 215), timeliness (n = 162), and the ease of tracking/recording/monitoring health status (n = 147). Of those who have used mobile devices for health information, three-quarters (n = 261) use mobile devices to search health information, while 32.8% (n =113) use applications and 31.7% (n =109) browse information. Those who have used applications for health information (n = 113) consider the alert feature (n=107) as the most useful, followed by the tracking/recording/monitoring feature (n =92), the customized information feature (n = 86), the video feature (n = 58), and the sharing feature (n =39). It is notable that although health applications have been actively developed and studied, a majority of the participants search for or browse information on mobile devices, instead of using applications. The researchers will discuss reasons that some of them did not use mobile devices to obtain health information, students’ concerns about using health applications, and features that they wish to have in health applications.

Keywords: Croatia, health information, information seeking behaviors, mobile devices, youth

Procedia PDF Downloads 364
1780 The Effect of Cognitively-Induced Self-Construal and Direct Behavioral Mimicry on Prosocial Behavior

Authors: Czar Matthew Gerard Dayday, Danielle Marie Estrera, Philippe Jefferson Galban, Gabrielle Marie Heredia

Abstract:

The study aimed to examine the effects of self-construal and direct mimicry on prosocial behavior. It used a 2 (self-construal: independent or interdependent) x 2 (mimicry: mimicry or non-mimicry) between-subjects factorial design, in which self-construal was cognitively induced through a story with varying pronouns (We, Us, Ourselves vs. Me, I, Myself) and prosocial behavior was measured as the amount of money donated to a fabricated advocacy. The research was conducted with a convenience sample of 88 undergraduate students (58 females, 33 males) aged 16 to 26 years from the University of the Philippines, Diliman. Results from the experiment show that neither factor has a significant main effect on prosocial behavior, and their interaction also has no significant effect, with No Mimicry x Independent ranking highest in amount of money donated and Mimicry x Interdependent ranking lowest. These results can be attributed to multiple factors, including the collectivist orientation and sense of kapwa of Filipinos, a role reversal in the methodology and the lack of a Chameleon Effect, and a weak priming of self-construal with respect to self-relatedness.

Keywords: behavior, mimicry, prosocial, self-construal

Procedia PDF Downloads 244
1779 Possibility Theory Based Multi-Attribute Decision-Making: Application in Facility Location-Selection Problem under Uncertain and Extreme Environment

Authors: Bezhan Ghvaberidze

Abstract:

A fuzzy multi-objective facility location-selection problem (FLSP) under uncertain and extreme environments, based on possibility theory, is developed. The model's uncertain parameters are presented as q-rung orthopair fuzzy values and transformed into the Dempster-Shafer belief structure environment. An objective function, a distribution centers' selection ranking index, is constructed as an extension of Dempster's extremal expectations under discriminated q-rung orthopair fuzzy information. Experts evaluate each humanitarian aid distribution center (HADC) against each of the uncertain factors. The HADC location problem is reduced to the bicriteria problem of partitioning the set of customers by the set of centers: (1) minimization of transportation costs, and (2) maximization of the centers' selection ranking indexes. Partitioning-type constraints are also constructed. To illustrate the obtained results, a numerical example of the facility location-selection problem is created.

Keywords: FLSP, multi-objective combinatorial optimization problem, evidence theory, HADC, q-rung orthopair fuzzy set, possibility theory

Procedia PDF Downloads 79
1778 Regulating Information Asymmetries at Online Platforms for Short-Term Vacation Rental in the European Union: The Legal Conundrum Continues

Authors: Vesna Lukovic

Abstract:

Online platforms, as new business models, play an important role in today's economy and in the functioning of the EU's internal market. In the travel industry, the algorithms used by online platforms for short-stay accommodation provide suggestions and price information to travelers; those suggestions and recommendations are displayed in search results via recommendation (ranking) systems. There has been a growing consensus that the existing legal framework is not sufficient to resolve problems arising from platform practices. To enhance the potential of the EU's Single Market, smaller businesses should be protected and their rights strengthened vis-à-vis large online platforms. Regulation (EU) 2019/1150 of the European Parliament and of the Council on promoting fairness and transparency for business users of online intermediation services aims to level the playing field in that respect. This research looks at Airbnb through the lens of this regulation. It explores key determinants and finds that, although the regulation is an important step in the right direction, it is not enough: it does not entail sufficiently clear obligations that would make online platforms an intermediary service that both accommodation providers and travelers could use with ease.

Keywords: algorithm, online platforms, ranking, consumers, EU regulation

Procedia PDF Downloads 105