Search results for: unsupervised classification
429 Deficiencies of Lung Segmentation Techniques using CT Scan Images for CAD
Authors: Nisar Ahmed Memon, Anwar Majid Mirza, S.A.M. Gilani
Abstract:
Segmentation is an important step in medical image analysis and classification for radiological evaluation or computer aided diagnosis. This paper presents the problem of inaccurate lung segmentation as observed in algorithms presented by researchers working in the area of medical image analysis. The different lung segmentation techniques have been tested using the dataset of 19 patients consisting of a total of 917 images. We obtained datasets of 11 patients from Ackron University, USA and of 8 patients from AGA Khan Medical University, Pakistan. After testing the algorithms against datasets, the deficiencies of each algorithm have been highlighted.Keywords: Computer Aided Diagnosis (CAD), MathematicalMorphology, Medical Image Analysis, Region Growing, Segmentation, Thresholding,
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2340428 Improving Academic Performance Prediction using Voting Technique in Data Mining
Authors: Ikmal Hisyam Mohamad Paris, Lilly Suriani Affendey, Norwati Mustapha
Abstract:
In this paper we compare the accuracy of data mining methods to classifying students in order to predicting student-s class grade. These predictions are more useful for identifying weak students and assisting management to take remedial measures at early stages to produce excellent graduate that will graduate at least with second class upper. Firstly we examine single classifiers accuracy on our data set and choose the best one and then ensembles it with a weak classifier to produce simple voting method. We present results show that combining different classifiers outperformed other single classifiers for predicting student performance.Keywords: Classification, Data Mining, Prediction, Combination of Multiple Classifiers.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2754427 Using Fractional Factorial Designs for Variable Importance in Random Forest Models
Authors: Ewa. M. Sztendur, Neil T. Diamond
Abstract:
Random Forests are a powerful classification technique, consisting of a collection of decision trees. One useful feature of Random Forests is the ability to determine the importance of each variable in predicting the outcome. This is done by permuting each variable and computing the change in prediction accuracy before and after the permutation. This variable importance calculation is similar to a one-factor-at a time experiment and therefore is inefficient. In this paper, we use a regular fractional factorial design to determine which variables to permute. Based on the results of the trials in the experiment, we calculate the individual importance of the variables, with improved precision over the standard method. The method is illustrated with a study of student attrition at Monash University.
Keywords: Random Forests, Variable Importance, Fractional Factorial Designs, Student Attrition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1997426 Framework and Characterization of Physical Internet
Authors: Charifa Fergani, Adiba El Bouzekri El Idrissi, Suzanne Marcotte, Abdelowahed Hajjaji
Abstract:
Over the last years, a new paradigm known as Physical Internet has been developed, and studied in logistics management. The purpose of this global and open system is to deal with logistics grand challenge by setting up an efficient and sustainable Logistics Web. The purpose of this paper is to review scientific articles dedicated to Physical Internet topic, and to provide a clustering strategy enabling to classify the literature on the Physical Internet, to follow its evolution, as well as to criticize it. The classification is based on three factors: Logistics Web, organization, and resources. Several papers about Physical Internet have been classified and analyzed along the Logistics Web, resources and organization views at a strategic, tactical and operational level, respectively. A developed cluster analysis shows which topics of the Physical Internet that are the less covered actually. Future researches are outlined for these topics.Keywords: Logistics web, Physical Internet, PI characterization, taxonomy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 857425 Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis
Authors: Reza Nadimi, Fariborz Jolai
Abstract:
This article combines two techniques: data envelopment analysis (DEA) and Factor analysis (FA) to data reduction in decision making units (DMU). Data envelopment analysis (DEA), a popular linear programming technique is useful to rate comparatively operational efficiency of decision making units (DMU) based on their deterministic (not necessarily stochastic) input–output data and factor analysis techniques, have been proposed as data reduction and classification technique, which can be applied in data envelopment analysis (DEA) technique for reduction input – output data. Numerical results reveal that the new approach shows a good consistency in ranking with DEA.Keywords: Effectiveness, Decision Making, Data EnvelopmentAnalysis, Factor Analysis
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2425424 EHW from Consumer Point of View: Consumer-Triggered Evolution
Authors: Yerbol Sapargaliyev, Tatiana Kalganova
Abstract:
Evolvable Hardware (EHW) has been regarded as adaptive system acquired by wide application market. Consumer market of any good requires diversity to satisfy consumers- preferences. Adaptation of EHW is a key technology that could provide individual approach to every particular user. This situation raises a question: how to set target for evolutionary algorithm? The existing techniques do not allow consumer to influence evolutionary process. Only designer at the moment is capable to influence the evolution. The proposed consumer-triggered evolution overcomes this problem by introducing new features to EHW that help adaptive system to obtain targets during consumer stage. Classification of EHW is given according to responsiveness, imitation of human behavior and target circuit response. Home intelligent water heating system is considered as an example.
Keywords: Actuators, consumer-triggered evolution, evolvable hardware, sensors.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1486423 A New Face Recognition Method using PCA, LDA and Neural Network
Authors: A. Hossein Sahoolizadeh, B. Zargham Heidari, C. Hamid Dehghani
Abstract:
In this paper, a new face recognition method based on PCA (principal Component Analysis), LDA (Linear Discriminant Analysis) and neural networks is proposed. This method consists of four steps: i) Preprocessing, ii) Dimension reduction using PCA, iii) feature extraction using LDA and iv) classification using neural network. Combination of PCA and LDA is used for improving the capability of LDA when a few samples of images are available and neural classifier is used to reduce number misclassification caused by not-linearly separable classes. The proposed method was tested on Yale face database. Experimental results on this database demonstrated the effectiveness of the proposed method for face recognition with less misclassification in comparison with previous methods.Keywords: Face recognition Principal component analysis, Linear discriminant analysis, Neural networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3213422 Target Detection with Improved Image Texture Feature Coding Method and Support Vector Machine
Authors: R. Xu, X. Zhao, X. Li, C. Kwan, C.-I Chang
Abstract:
An image texture analysis and target recognition approach of using an improved image texture feature coding method (TFCM) and Support Vector Machine (SVM) for target detection is presented. With our proposed target detection framework, targets of interest can be detected accurately. Cascade-Sliding-Window technique was also developed for automated target localization. Application to mammogram showed that over 88% of normal mammograms and 80% of abnormal mammograms can be correctly identified. The approach was also successfully applied to Synthetic Aperture Radar (SAR) and Ground Penetrating Radar (GPR) images for target detection.
Keywords: Image texture analysis, feature extraction, target detection, pattern classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1780421 Data Analysis Techniques for Predictive Maintenance on Fleet of Heavy-Duty Vehicles
Authors: Antonis Sideris, Elias Chlis Kalogeropoulos, Konstantia Moirogiorgou
Abstract:
The present study proposes a methodology for the efficient daily management of fleet vehicles and construction machinery. The application covers the area of remote monitoring of heavy-duty vehicles operation parameters, where specific sensor data are stored and examined in order to provide information about the vehicle’s health. The vehicle diagnostics allow the user to inspect whether maintenance tasks need to be performed before a fault occurs. A properly designed machine learning model is proposed for the detection of two different types of faults through classification. Cross validation is used and the accuracy of the trained model is checked with the confusion matrix.
Keywords: Fault detection, feature selection, machine learning, predictive maintenance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 781420 Risk Classification of SMEs by Early Warning Model Based on Data Mining
Authors: Nermin Ozgulbas, Ali Serhan Koyuncugil
Abstract:
One of the biggest problems of SMEs is their tendencies to financial distress because of insufficient finance background. In this study, an Early Warning System (EWS) model based on data mining for financial risk detection is presented. CHAID algorithm has been used for development of the EWS. Developed EWS can be served like a tailor made financial advisor in decision making process of the firms with its automated nature to the ones who have inadequate financial background. Besides, an application of the model implemented which covered 7,853 SMEs based on Turkish Central Bank (TCB) 2007 data. By using EWS model, 31 risk profiles, 15 risk indicators, 2 early warning signals, and 4 financial road maps has been determined for financial risk mitigation.
Keywords: Early Warning Systems, Data Mining, Financial Risk, SMEs.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3387419 Temporary Housing Respond to Disasters in Developing Countries- Case Study: Iran-Ardabil and Lorestan Province Earthquakes
Authors: Farzaneh Hadafi, Alireza Fallahi
Abstract:
Natural Disasters have always occurred through earth life. As human life developed on earth, he faced with different disasters. Since disasters would destroy his living areas and ruin his life, he learned how to respond and overcome to these matters. Nowadays, in the era of industrialized world and informatics, the man kind seeks for stages and classification of pre and post disaster process in order to identify a framework in these circumstances. Because too many parameters complicate these frameworks and proceedings, it seems that this goal has not been properly established yet and the only resource is guidelines of UNDRO (1982) [1]. This paper will discuss about temporary housing as one of an approved stage in disaster management field and investigate the affects of disapproval or dismissal of this at two earthquakes which took place in Iran.
Keywords: Temporary Housing, Temporary Sheltering, DisasterManagement, Iran
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2301418 Evolutionary Feature Selection for Text Documents using the SVM
Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection called (SVM_FS) and Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with GA_SVM method for a relatively small dimension of the feature vector.Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Genetic Algorithm, and Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1706417 Feature Selection Methods for an Improved SVM Classifier
Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, three feature selection methods are evaluated: Random Selection, Information Gain (IG) and Support Vector Machine feature selection (called SVM_FS). We show that the best results were obtained with SVM_FS method for a relatively small dimension of the feature vector. Also we present a novel method to better correlate SVM kernel-s parameters (Polynomial or Gaussian kernel).Keywords: Feature Selection, Learning with Kernels, SupportVector Machine, and Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1829416 Meta Random Forests
Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti
Abstract:
Leo Breimans Random Forests (RF) is a recent development in tree based classifiers and quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved results of classifications on standard data sets. Ensemble learning algorithms such as AdaBoost and Bagging have been in active research and shown improvements in classification results for several benchmarking data sets with mainly decision trees as their base classifiers. In this paper we experiment to apply these Meta learning techniques to the random forests. We experiment the working of the ensembles of random forests on the standard data sets available in UCI data sets. We compare the original random forest algorithm with their ensemble counterparts and discuss the results.Keywords: Random Forests [RF], ensembles, UCI.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2711415 The Presence of Enterobacters (E.Coli and Salmonella spp.) in Industrial Growing Poultry in Albania
Authors: Boci J., Çabeli P., Shtylla T., Kumbe I.
Abstract:
The development of the poultry industry in Albania is mainly based on the existence of intensive modern farms with huge capacities, which often are mixed with other forms. Colibacillosis is commonly displayed regardless of the type of breeding, delivering high mortality in poultry industry. The mechanisms with which pathogen enterobacters are able to cause the infection in poultry are not yet clear. The routine diagnose in the field, followed by isolation of E. coli and species of Salmonella genres in reference laboratories cannot lead in classification or full recognition of circulative strains in a territory, if it is not performed a differentiation among the present microorganisms in intensive farms and those in rural areas. In this study were isolated 1.496 strains of E. coli and 378 Salmonella spp. This study, presents distribution of poultry pathogenosity of E.coli and Salmonella spp., based on the usage of innovative diagnostic methods.
Keywords: poultry, E.coli, Salmonella spp., Enterobacter
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2072414 Technology and Its Social Implications: Myths and Realities in the Interpretation of the Concept
Authors: E. V. Veraszto, J. T. F. Camargo, D. Silva, N. A. Miranda, F. O. Simon, S. F. Amaral, L. V. Freitas
Abstract:
The concept of technology as well as itself has evolved continuously over time, such that, nowadays, this concept is still marked by myths and realities. Even the concept of science is frequently misunderstood as technology. In this way, this paper presents different forms of interpretation of the concept of technology in the course of history, as well as the social and cultural aspects associated with it, through an analysis made by means of insights from sociological studies of science and technology and its multiple relations with society. Through the analysis of contents, the paper presents a classification of how technology is interpreted in the social sphere and search channel efforts to show how a broader understanding can contribute to better interpretations of how scientific and technological development influences the environment in which we operate. The text also presents a particular point of view for the interpretation of the concept from the analysis throughout the whole work.
Keywords: Technology, conceptions of technology, technological myths, definition of technology.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1541413 Pervasive Differentiated Services: A QoS Model for Pervasive Systems
Authors: Sherif G. Aly
Abstract:
In this article, we introduce a mechanism by which the same concept of differentiated services used in network transmission can be applied to provide quality of service levels to pervasive systems applications. The classical DiffServ model, including marking and classification, assured forwarding, and expedited forwarding, are all utilized to create quality of service guarantees for various pervasive applications requiring different levels of quality of service. Through a collection of various sensors, personal devices, and data sources, the transmission of contextsensitive data can automatically occur within a pervasive system with a given quality of service level. Triggers, initiators, sources, and receivers are four entities labeled in our mechanism. An explanation of the role of each is provided, and how quality of service is guaranteed.
Keywords: Pervasive systems, quality of service, differentiated services, mobile devices.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1497412 A Proposed Hybrid Approach for Feature Selection in Text Document Categorization
Authors: M. F. Zaiyadi, B. Baharudin
Abstract:
Text document categorization involves large amount of data or features. The high dimensionality of features is a troublesome and can affect the performance of the classification. Therefore, feature selection is strongly considered as one of the crucial part in text document categorization. Selecting the best features to represent documents can reduce the dimensionality of feature space hence increase the performance. There were many approaches has been implemented by various researchers to overcome this problem. This paper proposed a novel hybrid approach for feature selection in text document categorization based on Ant Colony Optimization (ACO) and Information Gain (IG). We also presented state-of-the-art algorithms by several other researchers.Keywords: Ant colony optimization, feature selection, information gain, text categorization, text representation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2069411 Automatic Threshold Search for Heat Map Based Feature Selection: A Cancer Dataset Analysis
Authors: Carlos Huertas, Reyes Juarez-Ramirez
Abstract:
Public health is one of the most critical issues today; therefore, there is great interest to improve technologies in the area of diseases detection. With machine learning and feature selection, it has been possible to aid the diagnosis of several diseases such as cancer. In this work, we present an extension to the Heat Map Based Feature Selection algorithm, this modification allows automatic threshold parameter selection that helps to improve the generalization performance of high dimensional data such as mass spectrometry. We have performed a comparison analysis using multiple cancer datasets and compare against the well known Recursive Feature Elimination algorithm and our original proposal, the results show improved classification performance that is very competitive against current techniques.Keywords: Feature selection, mass spectrometry, biomarker discovery, cancer.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1589410 A Proposed Approach for Emotion Lexicon Enrichment
Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees
Abstract:
Document Analysis is an important research field that aims to gather the information by analyzing the data in documents. As one of the important targets for many fields is to understand what people actually want, sentimental analysis field has been one of the vital fields that are tightly related to the document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim of this research is to detect the emotions from text documents based on enriching the lexicon with adapting their content based on semantic patterns extraction. The proposed approach has been presented, and different experiments are applied by different perspectives to reveal the positive impact of the proposed approach on the classification results.Keywords: Document analysis, sentimental analysis, emotion detection, WEKA tool, NRC Lexicon.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1456409 Prediction of Cardiovascular Disease by Applying Feature Extraction
Authors: Nebi Gedik
Abstract:
Heart disease threatens the lives of a great number of people every year around the world. Heart issues lead to many of all deaths; therefore, early diagnosis and treatment are critical. The diagnosis of heart disease is complicated due to several factors affecting health such as high blood pressure, raised cholesterol, an irregular pulse rhythm, and more. Artificial intelligence has the potential to assist in the early detection and treatment of diseases. Improving heart failure prediction is one of the primary goals of research on heart disease risk assessment. This study aims to determine the features that provide the most successful classification prediction in detecting cardiovascular disease. The performances of each feature are compared using the K-Nearest Neighbor machine learning method. The feature that gives the most successful performance has been identified.
Keywords: Cardiovascular disease, feature extraction, supervised learning, k-NN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 134408 Evaluation of Clustering Based on Preprocessing in Gene Expression Data
Authors: Seo Young Kim, Toshimitsu Hamasaki
Abstract:
Microarrays have become the effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we express and compare the performance of several clustering methods based on data preprocessing including strategies of normalization or noise clearness. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. Consequently, clustering methods which are common used in microarray data analysis are affected by normalization and degree of noise and clearness for datasets.
Keywords: Gene expression, clustering, data preprocessing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1740407 Trabecular Texture Analysis Using Fractal Metrics for Bone Fragility Assessment
Authors: Khaled Harrar, Rachid Jennane
Abstract:
The purpose of this study is the discrimination of 28 postmenopausal with osteoporotic femoral fractures from an agematched control group of 28 women using texture analysis based on fractals. Two pre-processing approaches are applied on radiographic images; these techniques are compared to highlight the choice of the pre-processing method. Furthermore, the values of the fractal dimension are compared to those of the fractal signature in terms of the classification of the two populations. In a second analysis, the BMD measure at proximal femur was compared to the fractal analysis, the latter, which is a non-invasive technique, allowed a better discrimination; the results confirm that the fractal analysis of texture on calcaneus radiographs is able to discriminate osteoporotic patients with femoral fracture from controls. This discrimination was efficient compared to that obtained by BMD alone. It was also present in comparing subgroups with overlapping values of BMD.Keywords: Osteoporosis, fractal dimension, fractal signature, bone mineral density.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2328406 Bone Ash Impact on Soil Shear Strength
Authors: G. M. Ayininuola, A. O. Sogunro
Abstract:
Most failures of soil have been attributed to poor shear strength. Consequently, the present paper investigated the suitability of cattle bone ash as a possible additive to improve the shear strength of soils. Four soil samples were collected and stabilized with prepared bone ash in proportions of 3%, 5%, 7%, 10%, 15% and 20% by dry weight. Chemical analyses of the bone ash; followed by classification, compaction, and triaxial shear tests of the treated soil samples were conducted. Results obtained showed that bone ash contained high proportion of calcium oxide and phosphate. Addition of bone ash to soil samples led to increase in soil shear strengths within the range of 22.40% to 105.18% over the strengths of the respective control tests. Conversely, all samples attained maximum shear strengths at 7% bone ash stabilization. The use of bone ash as an additive will therefore improve the shear strength of soils; however, using bone ash quantities in excess of 7% may not yield ample results.
Keywords: Bone ash, Shear strength, Stabilization, Soil.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3558405 Recognition of Isolated Handwritten Latin Characters using One Continuous Route of Freeman Chain Code Representation and Feedforward Neural Network Classifier
Authors: Dewi Nasien, Siti S. Yuhaniz, Habibollah Haron
Abstract:
In a handwriting recognition problem, characters can be represented using chain codes. The main problem in representing characters using chain code is optimizing the length of the chain code. This paper proposes to use randomized algorithm to minimize the length of Freeman Chain Codes (FCC) generated from isolated handwritten characters. Feedforward neural network is used in the classification stage to recognize the image characters. Our test results show that by applying the proposed model, we reached a relatively high accuracy for the problem of isolated handwritten when tested on NIST database.Keywords: Handwriting Recognition, Freeman Chain Code andFeedforward Backpropagation Neural Networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1822404 Index t-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings
Authors: G. Candel, D. Naccache
Abstract:
t-SNE is an embedding method that the data science community has widely used. It helps two main tasks: to display results by coloring items according to the item class or feature value; and for forensic, giving a first overview of the dataset distribution. Two interesting characteristics of t-SNE are the structure preservation property and the answer to the crowding problem, where all neighbors in high dimensional space cannot be represented correctly in low dimensional space. t-SNE preserves the local neighborhood, and similar items are nicely spaced by adjusting to the local density. These two characteristics produce a meaningful representation, where the cluster area is proportional to its size in number, and relationships between clusters are materialized by closeness on the embedding. This algorithm is non-parametric. The transformation from a high to low dimensional space is described but not learned. Two initializations of the algorithm would lead to two different embedding. In a forensic approach, analysts would like to compare two or more datasets using their embedding. A naive approach would be to embed all datasets together. However, this process is costly as the complexity of t-SNE is quadratic, and would be infeasible for too many datasets. Another approach would be to learn a parametric model over an embedding built with a subset of data. While this approach is highly scalable, points could be mapped at the same exact position, making them indistinguishable. This type of model would be unable to adapt to new outliers nor concept drift. This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved. The optimization process minimizes two costs, one relative to the embedding shape and the second relative to the support embedding’ match. The embedding with the support process can be repeated more than once, with the newly obtained embedding. The successive embedding can be used to study the impact of one variable over the dataset distribution or monitor changes over time. This method has the same complexity as t-SNE per embedding, and memory requirements are only doubled. For a dataset of n elements sorted and split into k subsets, the total embedding complexity would be reduced from O(n2) to O(n2/k), and the memory requirement from n2 to 2(n/k)2 which enables computation on recent laptops. The method showed promising results on a real-world dataset, allowing to observe the birth, evolution and death of clusters. The proposed approach facilitates identifying significant trends and changes, which empowers the monitoring high dimensional datasets’ dynamics.
Keywords: Concept drift, data visualization, dimension reduction, embedding, monitoring, reusability, t-SNE, unsupervised learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 489403 Context-aware Recommender Systems using Data Mining Techniques
Authors: Kyoung-jae Kim, Hyunchul Ahn, Sangwon Jeong
Abstract:
This study proposes a novel recommender system to provide the advertisements of context-aware services. Our proposed model is designed to apply a modified collaborative filtering (CF) algorithm with regard to the several dimensions for the personalization of mobile devices – location, time and the user-s needs type. In particular, we employ a classification rule to understand user-s needs type using a decision tree algorithm. In addition, we collect primary data from the mobile phone users and apply them to the proposed model to validate its effectiveness. Experimental results show that the proposed system makes more accurate and satisfactory advertisements than comparative systems.Keywords: Location-based advertisement, Recommender system, Collaborative filtering, User needs type, Mobile user.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2174402 Analysis of Physicochemical Properties on Prediction of R5, X4 and R5X4 HIV-1 Coreceptor Usage
Authors: Kai-Ti Hsu, Hui-Ling Huang, Chun-Wei Tung, Yi-Hsiung Chen, Shinn-Ying Ho
Abstract:
Bioinformatics methods for predicting the T cell coreceptor usage from the array of membrane protein of HIV-1 are investigated. In this study, we aim to propose an effective prediction method for dealing with the three-class classification problem of CXCR4 (X4), CCR5 (R5) and CCR5/CXCR4 (R5X4). We made efforts in investigating the coreceptor prediction problem as follows: 1) proposing a feature set of informative physicochemical properties which is cooperated with SVM to achieve high prediction test accuracy of 81.48%, compared with the existing method with accuracy of 70.00%; 2) establishing a large up-to-date data set by increasing the size from 159 to 1225 sequences to verify the proposed prediction method where the mean test accuracy is 88.59%, and 3) analyzing the set of 14 informative physicochemical properties to further understand the characteristics of HIV-1coreceptors.Keywords: Coreceptor, genetic algorithm, HIV-1, SVM, physicochemical properties, prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2385401 A Review on Light Shafts Rendering for Indoor Scenes
Authors: Hatam H. Ali, Mohd Shahrizal Sunar, Hoshang Kolivand, Mohd Azhar Bin M. Arsad
Abstract:
Rendering light shafts is one of the important topics in computer gaming and interactive applications. The methods and models that are used to generate light shafts play crucial role to make a scene more realistic in computer graphics. This article discusses the image-based shadows and geometric-based shadows that contribute in generating volumetric shadows and light shafts, depending on ray tracing, radiosity, and ray marching technique. The main aim of this study is to provide researchers with background on a progress of light scattering methods so as to make it available for them to determine the technique best suited to their goals. It is also hoped that our classification helps researchers find solutions to the shortcomings of each method.
Keywords: Shaft of lights, realistic images, image-based, and geometric-based.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1612400 Water End-Use Classification with Contemporaneous Water-Energy Data and Deep Learning Network
Authors: Khoi A. Nguyen, Rodney A. Stewart, Hong Zhang
Abstract:
‘Water-related energy’ is energy use which is directly or indirectly influenced by changes to water use. Informatics applying a range of mathematical, statistical and rule-based approaches can be used to reveal important information on demand from the available data provided at second, minute or hourly intervals. This study aims to combine these two concepts to improve the current water end use disaggregation problem through applying a wide range of most advanced pattern recognition techniques to analyse the concurrent high-resolution water-energy consumption data. The obtained results have shown that recognition accuracies of all end-uses have significantly increased, especially for mechanised categories, including clothes washer, dishwasher and evaporative air cooler where over 95% of events were correctly classified.
Keywords: Deep learning network, smart metering, water end use, water-energy data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1363