Search results for: Unsupervised Clustering
189 Improving Fake News Detection Using K-means and Support Vector Machine Approaches
Authors: Kasra Majbouri Yazdi, Adel Majbouri Yazdi, Saeid Khodayi, Jingyu Hou, Wanlei Zhou, Saeed Saedy
Abstract:
Fake news and false information are big challenges of all types of media, especially social media. There is a lot of false information, fake likes, views and duplicated accounts as big social networks such as Facebook and Twitter admitted. Most information appearing on social media is doubtful and in some cases misleading. They need to be detected as soon as possible to avoid a negative impact on society. The dimensions of the fake news datasets are growing rapidly, so to obtain a better result of detecting false information with less computation time and complexity, the dimensions need to be reduced. One of the best techniques of reducing data size is using feature selection method. The aim of this technique is to choose a feature subset from the original set to improve the classification performance. In this paper, a feature selection method is proposed with the integration of K-means clustering and Support Vector Machine (SVM) approaches which work in four steps. First, the similarities between all features are calculated. Then, features are divided into several clusters. Next, the final feature set is selected from all clusters, and finally, fake news is classified based on the final feature subset using the SVM method. The proposed method was evaluated by comparing its performance with other state-of-the-art methods on several specific benchmark datasets and the outcome showed a better classification of false information for our work. The detection performance was improved in two aspects. On the one hand, the detection runtime process decreased, and on the other hand, the classification accuracy increased because of the elimination of redundant features and the reduction of datasets dimensions.
Keywords: Fake news detection, feature selection, support vector machine, K-means clustering, machine learning, social media.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4524188 Unsupervised Segmentation using Fuzzy Logicbased Texture Spectrum for MRI Brain Images
Authors: G.Wiselin Jiji, L.Ganesan
Abstract:
Textures are replications, symmetries and combinations of various basic patterns, usually with some random variation one of the gray-level statistics. This article proposes a new approach to Segment texture images. The proposed approach proceeds in 2 stages. First, in this method, local texture information of a pixel is obtained by fuzzy texture unit and global texture information of an image is obtained by fuzzy texture spectrum. The purpose of this paper is to demonstrate the usefulness of fuzzy texture spectrum for texture Segmentation. The 2nd Stage of the method is devoted to a decision process, applying a global analysis followed by a fine segmentation, which is only focused on ambiguous points. The above Proposed approach was applied to brain image to identify the components of brain in turn, used to locate the brain tumor and its Growth rate.Keywords: Fuzzy Texture Unit, Fuzzy Texture Spectrum, andPattern Recognition, segmentation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1698187 A Family of Distributions on Learnable Problems without Uniform Convergence
Authors: César Garza
Abstract:
In supervised binary classification and regression problems, it is well-known that learnability is equivalent to uniform convergence of the hypothesis class, and if a problem is learnable, it is learnable by empirical risk minimization. For the general learning setting of unsupervised learning tasks, there are non-trivial learning problems where uniform convergence does not hold. We present here the task of learning centers of mass with an extra feature that “activates” some of the coordinates over the unit ball in a Hilbert space. We show that the learning problem is learnable under a stable RLM rule. We introduce a family of distributions over the domain space with some mild restrictions for which the sample complexity of uniform convergence for these problems must grow logarithmically with the dimension of the Hilbert space. If we take this dimension to infinity, we obtain a learnable problem for which the uniform convergence property fails for a vast family of distributions.
Keywords: Statistical learning theory, learnability, uniform convergence, stability, regularized loss minimization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 354186 Fusion of Colour and Depth Information to Enhance Wound Tissue Classification
Authors: Darren Thompson, Philip Morrow, Bryan Scotney, John Winder
Abstract:
Patients with diabetes are susceptible to chronic foot wounds which may be difficult to manage and slow to heal. Diagnosis and treatment currently rely on the subjective judgement of experienced professionals. An objective method of tissue assessment is required. In this paper, a data fusion approach was taken to wound tissue classification. The supervised Maximum Likelihood and unsupervised Multi-Modal Expectation Maximisation algorithms were used to classify tissues within simulated wound models by weighting the contributions of both colour and 3D depth information. It was found that, at low weightings, depth information could show significant improvements in classification accuracy when compared to classification by colour alone, particularly when using the maximum likelihood method. However, larger weightings were found to have an entirely negative effect on accuracy.Keywords: Classification, data fusion, diabetic foot, stereophotogrammetry, tissue colour.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1710185 Complex Network Approach to International Trade of Fossil Fuel
Authors: Semanur Soyyiğit Kaya, Ercan Eren
Abstract:
Energy has a prominent role for development of nations. Countries which have energy resources also have strategic power in the international trade of energy since it is essential for all stages of production in the economy. Thus, it is important for countries to analyze the weaknesses and strength of the system. On the other side, international trade is one of the fields that are analyzed as a complex network via network analysis. Complex network is one of the tools to analyze complex systems with heterogeneous agents and interaction between them. A complex network consists of nodes and the interactions between these nodes. Total properties which emerge as a result of these interactions are distinct from the sum of small parts (more or less) in complex systems. Thus, standard approaches to international trade are superficial to analyze these systems. Network analysis provides a new approach to analyze international trade as a network. In this network, countries constitute nodes and trade relations (export or import) constitute edges. It becomes possible to analyze international trade network in terms of high degree indicators which are specific to complex networks such as connectivity, clustering, assortativity/disassortativity, centrality, etc. In this analysis, international trade of crude oil and coal which are types of fossil fuel has been analyzed from 2005 to 2014 via network analysis. First, it has been analyzed in terms of some topological parameters such as density, transitivity, clustering etc. Afterwards, fitness to Pareto distribution has been analyzed via Kolmogorov-Smirnov test. Finally, weighted HITS algorithm has been applied to the data as a centrality measure to determine the real prominence of countries in these trade networks. Weighted HITS algorithm is a strong tool to analyze the network by ranking countries with regards to prominence of their trade partners. We have calculated both an export centrality and an import centrality by applying w-HITS algorithm to the data. As a result, impacts of the trading countries have been presented in terms of high-degree indicators.Keywords: Complex network approach, fossil fuel, international trade, network theory.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2386184 Evolving Knowledge Extraction from Online Resources
Authors: Zhibo Xiao, Tharini Nayanika de Silva, Kezhi Mao
Abstract:
In this paper, we present an evolving knowledge extraction system named AKEOS (Automatic Knowledge Extraction from Online Sources). AKEOS consists of two modules, including a one-time learning module and an evolving learning module. The one-time learning module takes in user input query, and automatically harvests knowledge from online unstructured resources in an unsupervised way. The output of the one-time learning is a structured vector representing the harvested knowledge. The evolving learning module automatically schedules and performs repeated one-time learning to extract the newest information and track the development of an event. In addition, the evolving learning module summarizes the knowledge learned at different time points to produce a final knowledge vector about the event. With the evolving learning, we are able to visualize the key information of the event, discover the trends, and track the development of an event.Keywords: Evolving learning, knowledge extraction, knowledge graph, text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 942183 Parameter Selections of Fuzzy C-Means Based on Robust Analysis
Authors: Kuo-Lung Wu
Abstract:
The weighting exponent m is called the fuzzifier that can have influence on the clustering performance of fuzzy c-means (FCM) and mÎ[1.5,2.5] is suggested by Pal and Bezdek [13]. In this paper, we will discuss the robust properties of FCM and show that the parameter m will have influence on the robustness of FCM. According to our analysis, we find that a large m value will make FCM more robust to noise and outliers. However, if m is larger than the theoretical upper bound proposed by Yu et al. [14], the sample mean will become the unique optimizer. Here, we suggest to implement the FCM algorithm with mÎ[1.5,4] under the restriction when m is smaller than the theoretical upper bound.Keywords: Fuzzy c-means, robust, fuzzifier.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1660182 Clustered Signatures for Modeling and Recognizing 3D Rigid Objects
Authors: H. B. Darbandi, M. R. Ito, J. Little
Abstract:
This paper describes a probabilistic method for three-dimensional object recognition using a shared pool of surface signatures. This technique uses flatness, orientation, and convexity signatures that encode the surface of a free-form object into three discriminative vectors, and then creates a shared pool of data by clustering the signatures using a distance function. This method applies the Bayes-s rule for recognition process, and it is extensible to a large collection of three-dimensional objects.Keywords: Object recognition, modeling, classification, computer vision.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1278181 A Selective Markovianity Approach for Image Segmentation
Authors: A. Melouah, H. Merouani
Abstract:
A new Markovianity approach is introduced in this paper. This approach reduces the response time of classic Markov Random Fields approach. First, one region is determinated by a clustering technique. Then, this region is excluded from the study. The remaining pixel form the study zone and they are selected for a Markovianity segmentation task. With Selective Markovianity approach, segmentation process is faster than classic one.Keywords: Markovianity, response time, segmentation, study zone.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1458180 A Parallel Implementation of k-Means in MATLAB
Authors: Dimitris Varsamis, Christos Talagkozis, Alkiviadis Tsimpiris, Paris Mastorocostas
Abstract:
The aim of this work is the parallel implementation of k-means in MATLAB, in order to reduce the execution time. Specifically, a new function in MATLAB for serial k-means algorithm is developed, which meets all the requirements for the conversion to a function in MATLAB with parallel computations. Additionally, two different variants for the definition of initial values are presented. In the sequel, the parallel approach is presented. Finally, the performance tests for the computation times respect to the numbers of features and classes are illustrated.Keywords: K-means algorithm, clustering, parallel computations, MATLAB.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1157179 Adaption Model for Building Agile Pronunciation Dictionaries Using Phonemic Distance Measurements
Authors: Akella Amarendra Babu, Rama Devi Yellasiri, Natukula Sainath
Abstract:
Where human beings can easily learn and adopt pronunciation variations, machines need training before put into use. Also humans keep minimum vocabulary and their pronunciation variations are stored in front-end of their memory for ready reference, while machines keep the entire pronunciation dictionary for ready reference. Supervised methods are used for preparation of pronunciation dictionaries which take large amounts of manual effort, cost, time and are not suitable for real time use. This paper presents an unsupervised adaptation model for building agile and dynamic pronunciation dictionaries online. These methods mimic human approach in learning the new pronunciations in real time. A new algorithm for measuring sound distances called Dynamic Phone Warping is presented and tested. Performance of the system is measured using an adaptation model and the precision metrics is found to be better than 86 percent.Keywords: Pronunciation variations, dynamic programming, machine learning, natural language processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 800178 Dynamical Analysis of Circadian Gene Expression
Authors: Carla Layana Luis Diambra
Abstract:
Microarrays technique allows the simultaneous measurements of the expression levels of thousands of mRNAs. By mining this data one can identify the dynamics of the gene expression time series. By recourse of principal component analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from Cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of data can be reduced to three main components.
Keywords: circadian rhythms, clustering, gene expression, PCA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1592177 Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance
Authors: Ekachai Phaisangittisagul, Rapeepol Chongprachawat
Abstract:
Obtaining labeled data in supervised learning is often difficult and expensive, and thus the trained learning algorithm tends to be overfitting due to small number of training data. As a result, some researchers have focused on using unlabeled data which may not necessary to follow the same generative distribution as the labeled data to construct a high-level feature for improving performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data for classification performance. Specifically, we will apply difference unlabeled data which have different degrees of relation to the labeled data for handwritten digit classification task based on MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although the unlabeled data that is completely from different generative distribution to the labeled data provides the lowest classification performance, we still achieve high classification performance. This leads to expanding the applicability of the supervised learning algorithms using unsupervised learning.Keywords: Autoencoder, high-level feature, MNIST dataset, selftaught learning, supervised learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1832176 Layout Based Spam Filtering
Authors: Claudiu N.Musat
Abstract:
Due to the constant increase in the volume of information available to applications in fields varying from medical diagnosis to web search engines, accurate support of similarity becomes an important task. This is also the case of spam filtering techniques where the similarities between the known and incoming messages are the fundaments of making the spam/not spam decision. We present a novel approach to filtering based solely on layout, whose goal is not only to correctly identify spam, but also warn about major emerging threats. We propose a mathematical formulation of the email message layout and based on it we elaborate an algorithm to separate different types of emails and find the new, numerically relevant spam types.
Keywords: Clustering, layout, k-means, spam.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1649175 Web Traffic Mining using Neural Networks
Authors: Farhad F. Yusifov
Abstract:
With the explosive growth of data available on the Internet, personalization of this information space become a necessity. At present time with the rapid increasing popularity of the WWW, Websites are playing a crucial role to convey knowledge and information to the end users. Discovering hidden and meaningful information about Web users usage patterns is critical to determine effective marketing strategies to optimize the Web server usage for accommodating future growth. The task of mining useful information becomes more challenging when the Web traffic volume is enormous and keeps on growing. In this paper, we propose a intelligent model to discover and analyze useful knowledge from the available Web log data.Keywords: Clustering, Self organizing map, Web log files, Web traffic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1603174 EEG Spikes Detection, Sorting, and Localization
Authors: Mazin Z. Othman, Maan M. Shaker, Mohammed F. Abdullah
Abstract:
This study introduces a new method for detecting, sorting, and localizing spikes from multiunit EEG recordings. The method combines the wavelet transform, which localizes distinctive spike features, with Super-Paramagnetic Clustering (SPC) algorithm, which allows automatic classification of the data without assumptions such as low variance or Gaussian distributions. Moreover, the method is capable of setting amplitude thresholds for spike detection. The method makes use of several real EEG data sets, and accordingly the spikes are detected, clustered and their times were detected.Keywords: EEG time localizations, EEG spike detection, superparamagnetic algorithm, wavelet transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2549173 Comparison of Different Neural Network Approaches for the Prediction of Kidney Dysfunction
Authors: Ali Hussian Ali AlTimemy, Fawzi M. Al Naima
Abstract:
This paper presents the prediction of kidney dysfunction using different neural network (NN) approaches. Self organization Maps (SOM), Probabilistic Neural Network (PNN) and Multi Layer Perceptron Neural Network (MLPNN) trained with Back Propagation Algorithm (BPA) are used in this study. Six hundred and sixty three sets of analytical laboratory tests have been collected from one of the private clinical laboratories in Baghdad. For each subject, Serum urea and Serum creatinin levels have been analyzed and tested by using clinical laboratory measurements. The collected urea and cretinine levels are then used as inputs to the three NN models in which the training process is done by different neural approaches. SOM which is a class of unsupervised network whereas PNN and BPNN are considered as class of supervised networks. These networks are used as a classifier to predict whether kidney is normal or it will have a dysfunction. The accuracy of prediction, sensitivity and specificity were found for each type of the proposed networks .We conclude that PNN gives faster and more accurate prediction of kidney dysfunction and it works as promising tool for predicting of routine kidney dysfunction from the clinical laboratory data.Keywords: Kidney Dysfunction, Prediction, SOM, PNN, BPNN, Urea and Creatinine levels.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1931172 Routing Algorithm for a Clustered Network
Authors: Hemanth KumarA.R, Sudhakara G., Satyanarayana B.S.
Abstract:
The Cluster Dimension of a network is defined as, which is the minimum cardinality of a subset S of the set of nodes having the property that for any two distinct nodes x and y, there exist the node Si, s2 (need not be distinct) in S such that ld(x,s1) — d(y, s1)1 > 1 and d(x,s2) < d(x,$) for all s E S — {s2}. In this paper, strictly non overlap¬ping clusters are constructed. The concept of LandMarks for Unique Addressing and Clustering (LMUAC) routing scheme is developed. With the help of LMUAC routing scheme, It is shown that path length (upper bound)PLN,d < PLD, Maximum memory space requirement for the networkMSLmuAc(Az) < MSEmuAc < MSH3L < MSric and Maximum Link utilization factor MLLMUAC(i=3) < MLLMUAC(z03) < M Lc
Keywords: Metric dimension, Cluster dimension, Cluster.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1225171 Hydrochemical Contamination Profiling and Spatial-Temporal Mapping with the Support of Multivariate and Cluster Statistical Analysis
Authors: S. Barbosa, M. Pinto, J. A. Almeida, E. Carvalho, C. Diamantino
Abstract:
The aim of this work was to test a methodology able to generate spatial-temporal maps that can synthesize simultaneously the trends of distinct hydrochemical indicators in an old radium-uranium tailings dam deposit. Multidimensionality reduction derived from principal component analysis and subsequent data aggregation derived from clustering analysis allow to identify distinct hydrochemical behavioral profiles and generate synthetic evolutionary hydrochemical maps.
Keywords: Contamination plume migration, K-means of PCA scores, groundwater and mine water monitoring, spatial-temporal hydrochemical trends.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 624170 Product Features Extraction from Opinions According to Time
Authors: Kamal Amarouche, Houda Benbrahim, Ismail Kassou
Abstract:
Nowadays, e-commerce shopping websites have experienced noticeable growth. These websites have gained consumers’ trust. After purchasing a product, many consumers share comments where opinions are usually embedded about the given product. Research on the automatic management of opinions that gives suggestions to potential consumers and portrays an image of the product to manufactures has been growing recently. After launching the product in the market, the reviews generated around it do not usually contain helpful information or generic opinions about this product (e.g. telephone: great phone...); in the sense that the product is still in the launching phase in the market. Within time, the product becomes old. Therefore, consumers perceive the advantages/ disadvantages about each specific product feature. Therefore, they will generate comments that contain their sentiments about these features. In this paper, we present an unsupervised method to extract different product features hidden in the opinions which influence its purchase, and that combines Time Weighting (TW) which depends on the time opinions were expressed with Term Frequency-Inverse Document Frequency (TF-IDF). We conduct several experiments using two different datasets about cell phones and hotels. The results show the effectiveness of our automatic feature extraction, as well as its domain independent characteristic.
Keywords: Opinion mining, product feature extraction, sentiment analysis, SentiWordNet.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1300169 Association Rule and Decision Tree based Methodsfor Fuzzy Rule Base Generation
Authors: Ferenc Peter Pach, Janos Abonyi
Abstract:
This paper focuses on the data-driven generation of fuzzy IF...THEN rules. The resulted fuzzy rule base can be applied to build a classifier, a model used for prediction, or it can be applied to form a decision support system. Among the wide range of possible approaches, the decision tree and the association rule based algorithms are overviewed, and two new approaches are presented based on the a priori fuzzy clustering based partitioning of the continuous input variables. An application study is also presented, where the developed methods are tested on the well known Wisconsin Breast Cancer classification problem. Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2303168 A Strategy to Optimize the SPC Scheme for Mass Production of HDD Arm with ClusteringTechnique and Three-Way Control Chart
Authors: W. Chattinnawat
Abstract:
Consider a mass production of HDD arms where hundreds of CNC machines are used to manufacturer the HDD arms. According to an overwhelming number of machines and models of arm, construction of separate control chart for monitoring each HDD arm model by each machine is not feasible. This research proposed a strategy to optimize the SPC management on shop floor. The procedure started from identifying the clusters of the machine with similar manufacturing performance using clustering technique. The three way control chart ( I - MR - R ) is then applied to each clustered group of machine. This proposed research has advantageous to the manufacturer in terms of not only better performance of the SPC but also the quality management paradigm.Keywords: Three way control chart. I - MR - R , between/within variation, HDD arm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1635167 Cluster-Based Multi-Path Routing Algorithm in Wireless Sensor Networks
Authors: Si-Gwan Kim
Abstract:
Small-size and low-power sensors with sensing, signal processing and wireless communication capabilities is suitable for the wireless sensor networks. Due to the limited resources and battery constraints, complex routing algorithms used for the ad-hoc networks cannot be employed in sensor networks. In this paper, we propose node-disjoint multi-path hexagon-based routing algorithms in wireless sensor networks. We suggest the details of the algorithm and compare it with other works. Simulation results show that the proposed scheme achieves better performance in terms of efficiency and message delivery ratio.Keywords: Clustering, multi-path, routing protocol, sensor network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2470166 Accent Identification by Clustering and Scoring Formants
Authors: Dejan Stantic, Jun Jo
Abstract:
There have been significant improvements in automatic voice recognition technology. However, existing systems still face difficulties, particularly when used by non-native speakers with accents. In this paper we address a problem of identifying the English accented speech of speakers from different backgrounds. Once an accent is identified the speech recognition software can utilise training set from appropriate accent and therefore improve the efficiency and accuracy of the speech recognition system. We introduced the Q factor, which is defined by the sum of relationships between frequencies of the formants. Four different accents were considered and experimented for this research. A scoring method was introduced in order to effectively analyse accents. The proposed concept indicates that the accent could be identified by analysing their formants.Keywords: Accent Identification, Formants, Q Factor.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2089165 An Integrated Predictor for Cis-Regulatory Modules
Authors: Darby Tien-Hao Chang, Guan-Yu Shiu, You-Jie Sun
Abstract:
Various cis-regulatory module (CRM) predictors have been proposed in the last decade. Several well-established CRM predictors adopted different categories of prediction strategies, including window clustering, probabilistic modeling and phylogenetic footprinting. Appropriate integration of them has a potential to achieve high quality CRM prediction. This study analyzed four existing CRM predictors (ClusterBuster, MSCAN, CisModule and MultiModule) to seek a predictor combination that delivers a higher accuracy than individual CRM predictors. 465 CRMs across 140 Drosophila melanogaster genes from the RED fly database were used to evaluate the integrated CRM predictor proposed in this study. The results show that four predictor combinations achieved superior performance than the best individual CRM predictor.
Keywords: Cis-regulatory module, transcription factor binding site.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1650164 Searching for Similar Informational Articles in the Internet Channel
Authors: Sung Ho Ha, Seong Hyeon Joo, Hyun U. Pae
Abstract:
In terms of total online audience, newspapers are the most successful form of online content to date. The online audience for newspapers continues to demand higher-quality services, including personalized news services. News providers should be able to offer suitable users appropriate content. In this paper, a news article recommender system is suggested based on a user-s preference when he or she visits an Internet news site and reads the published articles. This system helps raise the user-s satisfaction, increase customer loyalty toward the content provider.
Keywords: Content classification, content recommendation, customer profiling, documents clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1607163 Initialization Method of Reference Vectors for Improvement of Recognition Accuracy in LVQ
Authors: Yuji Mizuno, Hiroshi Mabuchi
Abstract:
Initial values of reference vectors have significant influence on recognition accuracy in LVQ. There are several existing techniques, such as SOM and k-means, for setting initial values of reference vectors, each of which has provided some positive results. However, those results are not sufficient for the improvement of recognition accuracy. This study proposes an ACO-used method for initializing reference vectors with an aim to achieve recognition accuracy higher than those obtained through conventional methods. Moreover, we will demonstrate the effectiveness of the proposed method by applying it to the wine data and English vowel data and comparing its results with those of conventional methods.
Keywords: Clustering, LVQ, ACO, SOM, k-means.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1256162 Clustering-Based Detection of Alzheimer's Disease Using Brain MR Images
Authors: Sofia Matoug, Amr Abdel-Dayem
Abstract:
This paper presents a comprehensive survey of recent research studies to segment and classify brain MR (magnetic resonance) images in order to detect significant changes to brain ventricles. The paper also presents a general framework for detecting regions that atrophy, which can help neurologists in detecting and staging Alzheimer. Furthermore, a prototype was implemented to segment brain MR images in order to extract the region of interest (ROI) and then, a classifier was employed to differentiate between normal and abnormal brain tissues. Experimental results show that the proposed scheme can provide a reliable second opinion that neurologists can benefit from.
Keywords: Alzheimer, brain images, classification techniques, Magnetic Resonance Images, MRI.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1837161 Fast Short-Term Electrical Load Forecasting under High Meteorological Variability with a Multiple Equation Time Series Approach
Authors: Charline David, Alexandre Blondin Massé, Arnaud Zinflou
Abstract:
We present a multiple equation time series approach for the short-term load forecasting applied to the electrical power load consumption for the whole Quebec province, in Canada. More precisely, we take into account three meteorological variables — temperature, cloudiness and wind speed —, and we use meteorological measurements taken at different locations on the territory. Our final model shows an average MAPE score of 1.79% over an 8-years dataset.
Keywords: Short-term load forecasting, special days, time series, multiple equations, parallelization, clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 289160 University Ranking Systems – From League Table to Homogeneous Groups of Universities
Authors: M. Jarocka
Abstract:
The paper contains a review of the literature in terms of the critical analysis of methodologies of university ranking systems. Furthermore, the initiatives supported by the European Commission (U-Map, U-Multirank) and CHE Ranking are described. Special attention is paid to the tendencies in the development of ranking systems. According to the author, the ranking organizations should abandon the classic form of ranking, namely a hierarchical ordering of universities from “the best" to “the worse". In the empirical part of this paper, using one of the method of cluster analysis called k-means clustering, the author presents university classifications of the top universities from the Shanghai Jiao Tong University-s (SJTU) Academic Ranking of World Universities (ARWU).
Keywords: Classification, cluster analysis, ranking, university.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2744