Search results for: Document clustering
261 Encryption Image via Mutual Singular Value Decomposition
Authors: Adil Al-Rammahi
Abstract:
Image and document encryption is needed in e-government databases. In this paper we introduce two image matrices: one public, and one secret (the original). Each matrix is analyzed using singular value decomposition, which factors it into three matrices: a row orthogonal basis, a column orthogonal basis, and a diagonal spectral matrix. The product of the two row bases is calculated, and likewise the product of the two column bases. Finally, the public image, the row product and the column product are saved or transmitted. In the decryption stage, the original image is deduced by the mutual method from the three public files.
Keywords: Image cryptography, Singular values decomposition.
Downloads 2089
260 A Fast Adaptive Content-based Retrieval System of Satellite Images Database using Relevance Feedback
Authors: Hanan Mahmoud Ezzat Mahmoud, Alaa Abd El Fatah Hefnawy
Abstract:
In this paper, we present a system for content-based retrieval from a large database of classified satellite images, based on the user's relevance feedback (RF). In the proposed system, we divide each satellite image scene into small subimages, which are stored in the database. A modified radial basis function neural network plays an important role in clustering the subimages of the database according to the Euclidean distance between the query feature vector and the feature vectors of the other subimages. The advantage of using the RF technique in such queries is demonstrated by analyzing the database retrieval results.
Keywords: content-based image retrieval, large database of image, RBF neural net, relevance feedback
Downloads 1473
259 Human Behavior Modeling in Video Surveillance of Conference Halls
Authors: Nour Charara, Hussein Charara, Omar Abou Khaled, Hani Abdallah, Elena Mugellini
Abstract:
In this paper, we present a human behavior modeling approach for video scenes. This approach is used to model normal behaviors in conference halls. We exploit the Probabilistic Latent Semantic Analysis (PLSA) technique, using the 'Bag-of-Terms' paradigm, as a tool for exploring video data to learn the model by grouping similar activities. Our term vocabulary consists of 3D spatio-temporal patch groups assigned by the direction of motion. Our video representation captures the spatial information, the object trajectory, and the motion. The main importance of this approach is that it can be adapted to detect abnormal behaviors in order to ensure and enhance human security.
Keywords: Activity modeling, clustering, PLSA, video representation.
Downloads 845
258 Improved BEENISH Protocol for Wireless Sensor Networks Based Upon Fuzzy Inference System
Authors: Rishabh Sharma, Renu Vig, Neeraj Sharma
Abstract:
The main design parameter of a WSN (wireless sensor network) is energy consumption. To address this parameter, hierarchical clustering is a technique that helps extend the network's lifetime by consuming energy efficiently. This paper focuses on WSNs and on the FIS (fuzzy inference system) deployed to enhance the BEENISH protocol. The node energy, mobility, pause time and density are considered for the selection of the CH (cluster head). The simulation outcomes show that the proposed system outperforms the traditional system with regard to energy utilization and the number of packets transmitted to the sink.
Keywords: Wireless sensor network, sink, sensor node, routing protocol, fuzzy rule, fuzzy inference system.
Downloads 490
257 Cardiac Biosignal and Adaptation in Confined Nuclear Submarine Patrol
Authors: B. Lefranc, C. Aufauvre-Poupon, C. Martin-Krumm, M. Trousselard
Abstract:
Isolated and confined environments (ICE) present several challenges which may adversely affect human psychology and physiology. Submariners on Sub-Surface Ballistic Nuclear (SSBN) missions who are exposed to these environmental constraints must be able to perform complex tasks as part of their normal duties, as well as during crisis periods when emergency actions are required or imminent. The operational and environmental constraints they face challenge human adaptability. The impact of such a constrained environment has yet to be explored. Establishing a knowledge framework is a determining factor, particularly in view of upcoming long space travels. Ensuring that the crews are maintained in optimal operational conditions is a real challenge because the success of the mission depends on them. This study focused on evaluating the impact of stress on mental health and sensory degradation of submariners during an SSBN mission using cardiac biosignal (heart rate variability, HRV) clustering. This is a pragmatic exploratory study of a prospective cohort that included 19 submariner volunteers. HRV was recorded at baseline to classify the submariners by clustering according to their stress level based on parasympathetic (Pa) activity. Impacts of high Pa (HPa) versus low Pa (LPa) level at baseline were assessed on emotional state and sensory perception (interoception and exteroception) as a cardiac biosignal during the patrol and at a recovery time one month after. At no time point was a significant difference found in mental health between groups. There are significant differences in the interoceptive, exteroceptive and physiological functioning during the patrol and at recovery time. To sum up, compared to the LPa group, the HPa group maintains a higher level of psychosensory functioning during the patrol and at recovery but exhibits a decrease in Pa level.
The HPa group has less adaptable HRV characteristics, less unpredictability and flexibility of cardiac biosignals while the LPa group increases them during the patrol and at recovery time. This dissociation between psychosensory and physiological adaptation suggests two treatment modalities for ICE environments. To our best knowledge, our results are the first to highlight the impact of physiological differences in the HRV profile on the adaptability of submariners. Further studies are needed to evaluate the negative emotional and cognitive effects of ICEs based on the cardiac profile. Artificial intelligence offers a promising future for maintaining a high level of operational conditions. These future perspectives will not only allow submariners to be better prepared, but also to design feasible countermeasures that will help support analog environments that bring us closer to a trip to Mars.
Keywords: Adaptation, exteroception, HRV, ICE, interoception, SSBN.
Downloads 499
256 ReSeT : Reverse Engineering System Requirements Tool
Authors: Rosziati Ibrahim, Tiu Kian Yong
Abstract:
Reverse Engineering is a very important process in Software Engineering. It can be performed backwards from the system development life cycle (SDLC) in order to recover the source data or representations of a system through analysis of its structure, function and operation. We use reverse engineering to introduce an automatic tool that generates system requirements from program source code. The tool accepts Cµ programming source code, scans it line by line and passes it to a parser. Then, the engine of the tool generates system requirements for that specific program to facilitate reuse and enhancement of the program. The purpose of the tool is to help recover the system requirements of any system when the system requirements document (SRD) does not exist because the system was left undocumented.
Keywords: System Requirements, Reverse Engineering, Source Codes.
Downloads 1679
255 An Amalgam Approach for DICOM Image Classification and Recognition
Authors: J. Umamaheswari, G. Radhamani
Abstract:
This paper describes the process of recognizing and classifying brain images as normal or abnormal based on PSO-SVM. Image classification is becoming more important for the medical diagnosis process. In the medical field, especially in diagnosis, the abnormality of the patient is classified, which plays a great role in helping doctors diagnose the patient according to the severity of the disease. In the case of DICOM images, optimal recognition and early detection of diseases is very difficult. Our work focuses on recognition and classification of DICOM images based on a collective approach of digital image processing. For optimal recognition and classification, Particle Swarm Optimization (PSO), Genetic Algorithm (GA) and Support Vector Machine (SVM) are used. The collective approach using PSO-SVM gives high approximation capability and much faster convergence.
Keywords: Recognition, classification, Relaxed Median Filter, Adaptive thresholding, clustering and Neural Networks
Downloads 2265
254 Implementation of an IoT Sensor Data Collection and Analysis Library
Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee
Abstract:
Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, using the board to collect data presents many difficulties, especially when the user is not a computer programmer or is using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.
Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.
Downloads 2015
253 Methodology of Realization for Supervisor and Simulator Dedicated to a Semiconductor Research and Production Factory
Authors: Hanane Ondella, Pierre Ladet, David Ferrand, Pat Sloan
Abstract:
In the micro- and nano-technology industry, the «clean-rooms» dedicated to chip manufacturing are equipped with the most sophisticated equipment and tools. They use a large number of resources according to strict specifications for optimum operation and results. The distribution of «utilities» to production is ensured by teams who use a supervision tool. Studies show the value of controlling the various parameters of production and/or distribution, in real time, through a reliable and effective supervision tool. This document looks at a large part of the functions that the supervisor must provide, with complementary functionalities to help diagnosis and simulation, which prove very useful in our case where the supervised installations are complex and in constant evolution.
Keywords: Control-Command, evolution, non regression, performances, real time, simulation, supervision.
Downloads 1263
252 Evolutionary Feature Selection for Text Documents using the SVM
Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection (called SVM_FS) and Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with the GA_SVM method for a relatively small dimension of the feature vector.
Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Genetic Algorithm, and Classification.
Downloads 1708
251 Feature Selection Methods for an Improved SVM Classifier
Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, three feature selection methods are evaluated: Random Selection, Information Gain (IG) and Support Vector Machine feature selection (called SVM_FS). We show that the best results were obtained with the SVM_FS method for a relatively small dimension of the feature vector. We also present a novel method to better correlate the SVM kernel's parameters (Polynomial or Gaussian kernel).
Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, and Classification.
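As a rough illustration of the Information Gain criterion named in this abstract, the sketch below scores terms by how much knowing their presence or absence reduces class entropy. The function names, the toy documents and the binary term-presence representation are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction from splitting the collection on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy collection: each document is a set of terms.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "team"}]
labels = ["sport", "sport", "politics", "politics"]
top_terms = select_features(docs, labels, 2)
```

In this toy collection the terms "goal" and "vote" separate the two classes perfectly, while "team" appears in both classes and carries no information, so it is ranked last.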
Downloads 1832
250 Observation of the Correlations between Pair Wise Interaction and Functional Organization of the Proteins, in the Protein Interaction Network of Saccaromyces Cerevisiae
Authors: N. Tuncbag, T. Haliloglu, O. Keskin
Abstract:
Understanding the cell's large-scale organization is an interesting task in computational biology. Protein-protein interactions can reveal important aspects of the organization and function of the cell. Here, we investigated the correspondence between protein interactions and function for yeast. We obtained the correlations among the set of proteins. These correlations were then clustered using both hierarchical and biclustering methods. Detailed analyses of the proteins in each cluster were carried out by making use of their functional annotations. As a result, we found that some functional classes appear together in almost all biclusters. On the other hand, in hierarchical clustering, the dominance of one functional class is observed. In brief, from interaction data to function, some correlated results are noticed about the relationship between interaction and function which might give clues about the organization of the proteins.
Keywords: Pair-wise protein interactions, DIP database, functional correlations, biclustering.
Downloads 1712
249 Optimization of Protein Hydrolysate Production Process from Jatropha curcas Cake
Authors: Waraporn Apiwatanapiwat, Pilanee Vaithanomsat, Phanu Somkliang, Taweesiri Malapant
Abstract:
This was the first document reporting an investigation of protein hydrolysate production optimization from J. curcas cake. Proximate analysis of the raw material showed 18.98% protein, 5.31% ash, 8.52% moisture and 12.18% lipid. The appropriate protein hydrolysate production process began with grinding the J. curcas cake into small pieces. It was then suspended in 2.5% sodium hydroxide solution at a solution to J. curcas cake ratio of 80:1 (v/w). The hydrolysis reaction was held at 50 °C in a water bath for 45 minutes. After that, the supernatant (protein hydrolysate) was separated by centrifugation at 8000g for 30 minutes. The maximum yield of the resulting protein hydrolysate was 73.27% with 7.34% moisture, 71.69% total protein, 7.12% lipid and 2.49% ash. The product also dissolved well in water.
Keywords: Production, protein hydrolysate, Jatropha curcas cake, optimization.
Downloads 1958
248 Design of Personal Job Recommendation Framework on Smartphone Platform
Authors: Chayaporn Kaensar
Abstract:
Recently, Job Recommender Systems have gained much attention in industry since they solve the problem of information overload on recruiting websites. Therefore, we propose an Extended Personalized Job System that can provide appropriate jobs for job seekers and recommend suitable information to them using Data Mining Techniques and a Dynamic User Profile. On the other hand, companies can also interact with the system to publish and update job information. The system supports various platforms such as a web application and an Android mobile application. In this paper, User Profiles, Implicit User Action, User Feedback, and Clustering Techniques in WEKA libraries were applied and implemented. In addition, open source tools such as the Yii Web Application Framework, the Bootstrap Front End Framework and Android Mobile Technology were also applied.
Keywords: Recommendation, user profile, data mining, web technology, mobile technology.
Downloads 2154
247 A Common Automated Programming Platform for Knowledge Based Software Engineering
Authors: Ivan Stanev, Maria Koleva
Abstract:
A Common Platform for Automated Programming (CPAP) is defined in detail. Two versions of CPAP are described: Cloud based (including a set of components for classic programming and a set of components for combined programming); and Knowledge Based Automated Software Engineering (KBASE) based (including a set of components for automated programming and a set of components for ontology programming). Four KBASE products (Module for Automated Programming of Robots, Intelligent Product Manual, Intelligent Document Display, and Intelligent Form Generator) are analyzed and CPAP contributions to automated programming are presented.
Keywords: Automated Programming, Cloud Computing, Knowledge Based Software Engineering, Service Oriented Architecture.
Downloads 1895
246 A New Proxy Signature Scheme As Secure As ElGamal Signature
Authors: Song Han, Elizabeth Chang, Jie Wang, Wanquan Liu
Abstract:
A proxy signature allows the proxy signer to sign messages on behalf of the original signer. It is very useful when the original signer (e.g. the president of a company) is not available to sign a specific document. If the original signer cannot forge valid proxy signatures by impersonating the proxy signer, the scheme will be robust in a virtual environment; thus the original signer cannot shift any illegal action initiated by herself to the proxy signer. In this paper, we propose a new proxy signature scheme. The new scheme can prevent the original signer from impersonating the proxy signer to sign messages. The proposed scheme is based on the regular ElGamal signature. In addition, the fair privacy of the proxy signer is maintained. That is, the privacy of the proxy signer is preserved, and it can be revealed when necessary.
Keywords: ElGamal signature, Proxy signature, Security, Hash function, Fair privacy.
Downloads 1601
245 Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures
Authors: Do Phuc, Nguyen Thi Kim Phung
Abstract:
In this paper, we represent a protein structure as a graph, so that a protein structure database becomes a graph database. Each graph is represented by a spectral vector. We use the Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of the adjacency matrix of the graph. To measure the similarity between two graphs, we calculate the Euclidean distance between their spectral vectors. To cluster the graphs, we use an M-tree with the Euclidean distance to cluster the spectral vectors. In addition, the M-tree can be used for graph searching in a graph database. Our proposed method was tested on a graph database of 100 graphs representing 100 protein structures downloaded from the Protein Data Bank (PDB), and we compare the result with the SCOP hierarchical structure.
Keywords: Eigenvalues, m-tree, graph database, protein structure, spectral graph theory.
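The pipeline in this abstract (normalized Laplacian, Jacobi rotations, Euclidean distance between spectral vectors) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, graphs are assumed to be undirected and of equal size so that their spectral vectors have the same length, and the M-tree indexing step is omitted.

```python
import math


def normalized_laplacian(adj):
    """L = I - D^(-1/2) A D^(-1/2) for an undirected 0/1 adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                lap[i][j] = 1.0 if deg[i] > 0 else 0.0
            elif adj[i][j] and deg[i] > 0 and deg[j] > 0:
                lap[i][j] = -adj[i][j] / math.sqrt(deg[i] * deg[j])
    return lap


def jacobi_eigenvalues(matrix, tol=1e-18, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Choose the rotation angle that zeroes a[p][q].
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                # Column update (A <- A J), then row update (A <- J^T A).
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def spectral_vector(adj):
    """Sorted eigenvalues of the normalized Laplacian of a graph."""
    return jacobi_eigenvalues(normalized_laplacian(adj))


def spectral_distance(u, v):
    """Euclidean distance between two equal-length spectral vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Two hypothetical 3-node graphs: a path and a triangle.
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
distance = spectral_distance(spectral_vector(path3), spectral_vector(triangle))
```

The normalized Laplacian spectrum of the 3-node path is (0, 1, 2) and that of the triangle is (0, 1.5, 1.5), so a small distance here reflects structurally similar graphs. How graphs of different sizes are compared is not stated in the abstract and is left out of this sketch.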
Downloads 1658
244 Securing Message in Wireless Sensor Network by using New Method of Code Conversions
Authors: Ahmed Chalak Shakir, GuXuemai, Jia Min
Abstract:
Recently, wireless sensor networks have attracted growing interest, are widely used in many commercial and military applications, and may be deployed in critical scenarios (e.g. when a malfunctioning network results in danger to human life or great financial loss). Such networks must be protected against human intrusion by using secret keys to encrypt the messages exchanged between communicating nodes. Both the symmetric and asymmetric methods have their own drawbacks for use in key management. Thus, we avoid the weaknesses of these two cryptosystems and make use of their advantages to establish a secure environment by developing a new encryption method based on the idea of code conversion. The code conversion's equations are used as the key for designing the proposed system, based on the basic principles of logic gates. Using our security architecture, we show how to reduce significant attacks on wireless sensor networks.
Keywords: logic gates, code conversions, Gray-code, and clustering.
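For readers unfamiliar with the code conversions listed in the keywords, the sketch below shows the standard binary to Gray-code conversion, which is built entirely from XOR gates. It only illustrates the conversion itself; the abstract does not specify how the authors turn such equations into a key, and a bare Gray-code mapping is not an encryption scheme. The function names are assumptions.

```python
def binary_to_gray(n: int) -> int:
    """Gray code of n: each output bit is the XOR of two adjacent input bits."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Invert the Gray code by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def convert_message(data: bytes) -> bytes:
    """Apply the binary-to-Gray conversion to every byte of a message."""
    return bytes(binary_to_gray(b) for b in data)


def restore_message(data: bytes) -> bytes:
    """Apply the Gray-to-binary conversion to every byte of a message."""
    return bytes(gray_to_binary(b) for b in data)


# Hypothetical sensor payload used only to show the round trip.
payload = b"node-17:temp=23"
converted = convert_message(payload)
restored = restore_message(converted)
```

Because the conversion is a fixed, keyless bijection on each byte, anyone who knows the mapping can invert it; any security would have to come from the keyed construction the paper describes.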
Downloads 1660
243 Effects of the Stock Market Dynamic Linkages on the Central and Eastern European Capital Markets
Authors: Ioan Popa, Cristiana Tudor, Radu Lupu
Abstract:
The interdependences among stock market indices have long been studied by academics around the world. The current financial crisis opened the door to a wide range of opinions concerning the understanding and measurement of the connections considered to provide the controversial phenomenon of market integration. Using data on the log-returns of 17 stock market indices that include most of the CEE markets, from 2005 until 2009, our paper studies the problem of these dependences using a new methodological tool that takes into account both the volatility clustering effect and the stochastic properties of these linkages through a Dynamic Conditional System of Simultaneous Equations. We find that the crisis is well captured by our model as it provides evidence for the high volatility – high dependence effect.
Keywords: Stock market interdependences, Dynamic System of Simultaneous Equations, financial crisis
Downloads 1781
242 NOHIS-Tree: High-Dimensional Index Structure for Similarity Search
Authors: Mounira Taileb, Sami Touati
Abstract:
In Content-Based Image Retrieval systems it is important to use an efficient indexing technique in order to perform and accelerate the search in huge databases. The indexing technique used should also support the high dimensions of image features. In this paper we present the hierarchical index NOHIS-tree (Non Overlapping Hierarchical Index Structure) when we scale up to very large databases. We also present a study of the influence of clustering on search time. The performance test results show that NOHIS-tree performs better than SR-tree. Tests also show that NOHIS-tree keeps its performance in high dimensional spaces. We include performance tests that try to determine the number of clusters in NOHIS-tree that gives the best search time.
Keywords: High-dimensional indexing, k-nearest neighbors search.
Downloads 1447
241 A Comparative Study of Image Segmentation Algorithms
Authors: Mehdi Hosseinzadeh, Parisa Khoshvaght
Abstract:
In some applications, such as image recognition or compression, segmentation refers to the process of partitioning a digital image into multiple segments. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. Image segmentation classifies or clusters an image into several parts (regions) according to a feature of the image, for example, the pixel value or the frequency response. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics. The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image. Several image segmentation algorithms have been proposed to segment an image before recognition or compression. To date, many image segmentation algorithms exist and are extensively applied in science and daily life. According to their segmentation method, we can approximately categorize them into region-based segmentation, data clustering, and edge-based segmentation. In this paper, we give a study of several popular image segmentation algorithms that are available.
Keywords: Image Segmentation, hierarchical segmentation, partitional segmentation, density estimation.
Downloads 2920
240 Brain MRI Segmentation and Lesions Detection by EM Algorithm
Authors: Mounira Rouaïnia, Mohamed Salah Medjram, Noureddine Doghmane
Abstract:
In Multiple Sclerosis, pathological changes in the brain result in deviations in signal intensity on Magnetic Resonance Images (MRI). Quantitative analysis of these changes and their correlation with clinical findings provides important information for diagnosis. This constitutes the objective of our work. A new approach is developed. After enhancing the image contrast and extracting the brain with a mathematical morphology algorithm, we proceed to the brain segmentation. Our approach is based on building a statistical model from the data itself, for normal brain MRI and including clustering of tissue types. Then we detect signal abnormalities (MS lesions) as a rejection class containing voxels that are not explained by the built model. We validate the method on MR images of Multiple Sclerosis patients by comparing its results with those of human expert segmentation.
Keywords: EM algorithm, Magnetic Resonance Imaging, Mathematical morphology, Markov random model.
Downloads 2169
239 Constitutional Complaint as an Instrument of Fulfilling the Worker's Rights in Croatian Legal System
Authors: Dragana Bjelić, Mirela Mezak Stastny
Abstract:
This paper begins with a formal definition of human rights and freedoms; the basic document on this subject is undoubtedly the French Declaration of the Rights of Man and of the Citizen from 1789. The paper then analyzes the legal sources relevant to workers' rights in the legal system of the Republic of Croatia, international contracts and the Labour Act, which is also a master bill regarding workers' rights. The authors also deal with issues of the Constitutional Court of the Republic of Croatia and its position in the judicial system of the Republic of Croatia, as well as with the specifics of the Constitutional Complaint. The crucial part of the paper is based on research conducted to determine the implementation of the rights and liberties guaranteed by Articles 54 and 55 of the Constitution of the Republic of Croatia by means of the Constitutional Complaint.
Keywords: a right to work, a freedom of work, Constitutional Court of Republic of Croatia, Constitutional Complaint.
Downloads 1567
238 Variance Based Component Analysis for Texture Segmentation
Authors: Zeinab Ghasemi, S. Amirhassan Monadjemi, Abbas Vafaei
Abstract:
This paper presents a comparative analysis of a new unsupervised PCA-based technique for steel plate texture segmentation towards defect detection. The proposed scheme, called Variance Based Component Analysis or VBCA, employs PCA for feature extraction, applies a feature reduction algorithm based on the variance of eigenpictures and classifies the pixels as defective or normal. While the classic PCA uses a clusterer like k-means for pixel clustering, VBCA employs thresholding and some post processing operations to label pixels as defective or normal. The experimental results show that the proposed VBCA algorithm is 12.46% more accurate and 78.85% faster than the classic PCA.
Downloads 1974
237 Auto Classification for Search Intelligence
Authors: Lilac A. E. Al-Safadi
Abstract:
This paper proposes an auto-classification algorithm for Web pages using Data mining techniques. We consider the problem of discovering association rules between terms in a set of Web pages belonging to a category in a search engine database, and present an auto-classification algorithm for solving this problem that is fundamentally based on the Apriori algorithm. The proposed technique has two phases. The first phase is a training phase where human experts determine the categories of different Web pages, and the supervised Data mining algorithm combines these categories with appropriate weighted index terms according to the highest supported rules among the most frequent words. The second phase is the categorization phase where a web crawler crawls through the World Wide Web to build a database categorized according to the result of the data mining approach. This database contains URLs and their categories.
Keywords: Information Processing on the Web, Data Mining, Document Classification.
Downloads 1620
236 Oncogene Identification using Filter based Approaches between Various Cancer Types in Lung
Authors: Michael Netzer, Michael Seger, Mahesh Visvanathan, Bernhard Pfeifer, Gerald H. Lushington, Christian Baumgartner
Abstract:
Lung cancer accounts for the most cancer related deaths for men as well as for women. The identification of cancer associated genes and the related pathways are essential to provide an important possibility in the prevention of many types of cancer. In this work two filter approaches, namely the information gain and the biomarker identifier (BMI) are used for the identification of different types of small-cell and non-small-cell lung cancer. A new method to determine the BMI thresholds is proposed to prioritize genes (i.e., primary, secondary and tertiary) using a k-means clustering approach. Sets of key genes were identified that can be found in several pathways. It turned out that the modified BMI is well suited for microarray data and therefore BMI is proposed as a powerful tool for the search for new and so far undiscovered genes related to cancer.
Keywords: lung cancer, micro arrays, data mining, feature selection.
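The abstract describes deriving thresholds from a k-means clustering of scores so that genes fall into primary, secondary and tertiary groups. A minimal one-dimensional sketch of that idea follows. Everything specific here is an assumption: the function names, the toy scores, the deterministic initialization, the use of midpoints between cluster centers as thresholds, and the convention that a higher score means higher priority. The BMI score itself is not defined in the abstract and is not computed.

```python
from bisect import bisect_right


def kmeans_1d(values, k=3, max_iter=100):
    """Plain k-means on scalar scores; returns the sorted cluster centers."""
    data = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def thresholds_from_centers(centers):
    """Midpoints between neighboring centers act as cut-off scores."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def assign_tier(score, thresholds,
                labels=("tertiary", "secondary", "primary")):
    """Map a score to a tier; assumes higher scores mean higher priority."""
    return labels[bisect_right(thresholds, score)]


# Hypothetical scores for nine genes.
scores = [0.1, 0.2, 0.15, 1.0, 1.1, 0.9, 3.0, 3.2, 2.9]
centers = kmeans_1d(scores)
thresholds = thresholds_from_centers(centers)
tiers = [assign_tier(s, thresholds) for s in scores]
```

With three well-separated groups of scores, the two thresholds fall in the gaps between the groups, and each gene is labeled by the interval its score lands in.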
Downloads 1757
235 Personal Information Classification Based on Deep Learning in Automatic Form Filling System
Authors: Shunzuo Wu, Xudong Luo, Yuanxiu Liao
Abstract:
Recently, the rapid development of deep learning has allowed artificial intelligence (AI) to penetrate many fields, replacing manual work there. In particular, AI systems have become a research focus in the field of office automation. To meet real needs in office automation, in this paper we develop an automatic form filling system. Specifically, it uses two classical neural network models and several word embedding models to classify various relevant information elicited from the Internet. When training the neural network models, we use less noisy and balanced data. We conduct a series of experiments to test our system, and the results show that it can achieve better classification results.
Keywords: Personal information, deep learning, auto fill, NLP, document analysis.
Downloads 869
234 Identification of Nonlinear Systems Using Radial Basis Function Neural Network
Authors: C. Pislaru, A. Shebani
Abstract:
This paper uses the radial basis function neural network (RBFNN) for system identification of nonlinear systems. Five nonlinear systems are used to examine the performance of the RBFNN in modeling nonlinear systems: a dual tank system, a single tank system, a DC motor system, and two academic models. The feed-forward method is used to model the nonlinear dynamic systems, and the K-means clustering algorithm is used to select the centers of the radial basis function network because it is reliable, offers fast convergence, and can handle large data sets. The least mean square method is used to adjust the weights of the output layer, and the Euclidean distance method is used to determine the width of the Gaussian function.
Keywords: System identification, Nonlinear system, Neural networks, RBF neural network.
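The three ingredients named in the abstract (K-means for the centers, a Euclidean-distance rule for the Gaussian widths, and least mean squares for the output weights) can be combined in a small one-dimensional sketch. Everything specific here is an assumption made for illustration: the sine target stands in for a nonlinear system, the width of each unit is taken as the distance to its nearest neighbouring center, and the learning rate, epoch count, and function names are hypothetical.

```python
import math

def kmeans_1d(xs, k, iters=50):
    """Plain Lloyd iterations on scalar data; returns k sorted centers."""
    data = sorted(xs)
    # Deterministic start: k evenly spaced samples from the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)

def nearest_center_widths(centers):
    """Assumed width rule: distance to the closest other center
    (requires at least two distinct centers)."""
    return [min(abs(c - o) for j, o in enumerate(centers) if j != i)
            for i, c in enumerate(centers)]

def rbf_features(x, centers, widths):
    """Gaussian activations of the hidden layer for input x."""
    return [math.exp(-((x - c) ** 2) / (2 * w ** 2))
            for c, w in zip(centers, widths)]

def mse(xs, ys, centers, widths, weights):
    total = 0.0
    for x, y in zip(xs, ys):
        phi = rbf_features(x, centers, widths)
        total += (y - sum(w * p for w, p in zip(weights, phi))) ** 2
    return total / len(xs)

def train_lms(xs, ys, centers, widths, lr=0.1, epochs=200):
    """Least-mean-squares update of the linear output weights."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            phi = rbf_features(x, centers, widths)
            err = y - sum(w * p for w, p in zip(weights, phi))
            weights = [w + lr * err * p for w, p in zip(weights, phi)]
    return weights

# Stand-in "system": one period of a sine sampled at 21 points.
xs = [i / 20 for i in range(21)]
ys = [math.sin(2 * math.pi * x) for x in xs]

centers = kmeans_1d(xs, k=5)
widths = nearest_center_widths(centers)
initial_mse = mse(xs, ys, centers, widths, [0.0] * len(centers))
weights = train_lms(xs, ys, centers, widths)
final_mse = mse(xs, ys, centers, widths, weights)
```

Fixing the centers and widths first leaves only the output weights to learn, which is a linear problem; that is why a simple rule such as LMS is sufficient for the final layer.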
Downloads 2867
233 Color Image Segmentation Using Competitive and Cooperative Learning Approach
Authors: Yinggan Tang, Xinping Guan
Abstract:
Color image segmentation can be considered a clustering procedure in feature space. k-means and its adaptive version, the competitive learning approach, are powerful tools for data clustering. However, k-means and competitive learning suffer from several drawbacks, such as the dead-unit problem and the need to pre-specify the number of clusters. In this paper, we explore the use of the competitive and cooperative learning (CCL) approach to perform color image segmentation. In CCL, seed points not only compete with each other, but the winner also dynamically selects several of its nearest competitors to form a cooperative team that adapts to the input together. As a result, the approach can automatically select the correct number of clusters and avoid the dead-unit problem. Experimental results show that CCL can obtain better segmentation results.
Keywords: Color image segmentation, competitive learning, cluster, k-means algorithm, competitive and cooperative learning.
Downloads 1618
232 Clustering Multivariate Empiric Characteristic Functions for Multi-Class SVM Classification
Authors: María-Dolores Cubiles-de-la-Vega, Rafael Pino-Mejías, Esther-Lydia Silva-Ramírez
Abstract:
A dissimilarity measure between the empiric characteristic functions of the subsamples associated with the different classes in a multivariate data set is proposed. This measure can be efficiently computed, and it depends on all the cases of each class. It may be used to find groups of similar classes, which could be joined for further analysis, or it could be employed to perform an agglomerative hierarchical cluster analysis of the set of classes. The final tree can serve to build a family of binary classification models, offering an alternative approach to the multi-class SVM problem. We have compared this dendrogram-based SVM approach with the one-against-one SVM approach on four publicly available data sets, three of them being microarray data. The two approaches were found to perform equivalently, but the dendrogram-based solution requires a smaller number of binary SVM models.
Keywords: Cluster Analysis, Empiric Characteristic Function, Multi-class SVM, R.
Downloads 1883