Search results for: process discovery.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5553

Search results for: process discovery.

5523 Actionable Rules: Issues and New Directions

Authors: Harleen Kaur

Abstract:

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden and interesting patterns from a huge amount of data stored in databases. Data mining is a stage of the KDD process that aims at selecting and applying a particular data mining algorithm to extract an interesting and useful knowledge. It is highly expected that data mining methods will find interesting patterns according to some measures, from databases. It is of vital importance to define good measures of interestingness that would allow the system to discover only the useful patterns. Measures of interestingness are divided into objective and subjective measures. Objective measures are those that depend only on the structure of a pattern and which can be quantified by using statistical methods. While, subjective measures depend only on the subjectivity and understandability of the user who examine the patterns. These subjective measures are further divided into actionable, unexpected and novel. The key issues that faces data mining community is how to make actions on the basis of discovered knowledge. For a pattern to be actionable, the user subjectivity is captured by providing his/her background knowledge about domain. Here, we consider the actionability of the discovered knowledge as a measure of interestingness and raise important issues which need to be addressed to discover actionable knowledge.

Keywords: Data Mining Community, Knowledge Discovery inDatabases (KDD), Interestingness, Subjective Measures, Actionability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1903
5522 Goal-Based Request Cloud Resource Broker in Medical Application

Authors: Mohamad Izuddin Nordin, Azween Abdullah, Mahamat Issa Hassan

Abstract:

In this paper, cloud resource broker using goalbased request in medical application is proposed. To handle recent huge production of digital images and data in medical informatics application, the cloud resource broker could be used by medical practitioner for proper process in discovering and selecting correct information and application. This paper summarizes several reviewed articles to relate medical informatics application with current broker technology and presents a research work in applying goal-based request in cloud resource broker to optimize the use of resources in cloud environment. The objective of proposing a new kind of resource broker is to enhance the current resource scheduling, discovery, and selection procedures. We believed that it could help to maximize resources allocation in medical informatics application.

Keywords: Broker, Cloud Computing, Medical Informatics, Resources Discovery, Resource Selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2013
5521 Neural-Symbolic Machine-Learning for Knowledge Discovery and Adaptive Information Retrieval

Authors: Hager Kammoun, Jean Charles Lamirel, Mohamed Ben Ahmed

Abstract:

In this paper, a model for an information retrieval system is proposed which takes into account that knowledge about documents and information need of users are dynamic. Two methods are combined, one qualitative or symbolic and the other quantitative or numeric, which are deemed suitable for many clustering contexts, data analysis, concept exploring and knowledge discovery. These two methods may be classified as inductive learning techniques. In this model, they are introduced to build “long term" knowledge about past queries and concepts in a collection of documents. The “long term" knowledge can guide and assist the user to formulate an initial query and can be exploited in the process of retrieving relevant information. The different kinds of knowledge are organized in different points of view. This may be considered an enrichment of the exploration level which is coherent with the concept of document/query structure.

Keywords: Information Retrieval Systems, machine learning, classification, Galois lattices, Self Organizing Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1150
5520 Automata Theory Approach for Solving Frequent Pattern Discovery Problems

Authors: Renáta Iváncsy, István Vajk

Abstract:

The various types of frequent pattern discovery problem, namely, the frequent itemset, sequence and graph mining problems are solved in different ways which are, however, in certain aspects similar. The main approach of discovering such patterns can be classified into two main classes, namely, in the class of the levelwise methods and in that of the database projection-based methods. The level-wise algorithms use in general clever indexing structures for discovering the patterns. In this paper a new approach is proposed for discovering frequent sequences and tree-like patterns efficiently that is based on the level-wise issue. Because the level-wise algorithms spend a lot of time for the subpattern testing problem, the new approach introduces the idea of using automaton theory to solve this problem.

Keywords: Frequent pattern discovery, graph mining, pushdownautomaton, sequence mining, state machine, tree mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1584
5519 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, we have seen an increasing importance of research and study on knowledge source, decision support systems, data mining and procedure of knowledge discovery in data bases and it is considered that each of these aspects affects the others. In this article, we have merged information source and knowledge source to suggest a knowledge based system within limits of management based on storing and restoring of knowledge to manage information and improve decision making and resources. In this article, we have used method of data mining and Apriori algorithm in procedure of knowledge discovery one of the problems of Apriori algorithm is that, a user should specify the minimum threshold for supporting the regularity. Imagine that a user wants to apply Apriori algorithm for a database with millions of transactions. Definitely, the user does not have necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve Apriori algorithm. To achieve our goal, we tried using fuzzy logic to put data in different clusters before applying the Apriori algorithm for existing data in the database and we also try to suggest the most suitable threshold to the user automatically.

Keywords: Decision support system, data mining, knowledge discovery, data discovery, fuzzy logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2082
5518 Feature Selection with Kohonen Self Organizing Classification Algorithm

Authors: Francesco Maiorana

Abstract:

In this paper a one-dimension Self Organizing Map algorithm (SOM) to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification for each class a set of positive and negative features is computed. This set of features is selected as result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, the application of the methodology to massive datasets.

Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3007
5517 Knowledge Discovery Techniques for Talent Forecasting in Human Resource Application

Authors: Hamidah Jantan, Abdul Razak Hamdan, Zulaiha Ali Othman

Abstract:

Human Resource (HR) applications can be used to provide fair and consistent decisions, and to improve the effectiveness of decision making processes. Besides that, among the challenge for HR professionals is to manage organization talents, especially to ensure the right person for the right job at the right time. For that reason, in this article, we attempt to describe the potential to implement one of the talent management tasks i.e. identifying existing talent by predicting their performance as one of HR application for talent management. This study suggests the potential HR system architecture for talent forecasting by using past experience knowledge known as Knowledge Discovery in Database (KDD) or Data Mining. This article consists of three main parts; the first part deals with the overview of HR applications, the prediction techniques and application, the general view of Data mining and the basic concept of talent management in HRM. The second part is to understand the use of Data Mining technique in order to solve one of the talent management tasks, and the third part is to propose the potential HR system architecture for talent forecasting.

Keywords: HR Application, Knowledge Discovery inDatabase (KDD), Talent Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4441
5516 Predicting Protein Function using Decision Tree

Authors: Manpreet Singh, Parminder Kaur Wadhwa, Surinder Kaur

Abstract:

The drug discovery process starts with protein identification because proteins are responsible for many functions required for maintenance of life. Protein identification further needs determination of protein function. Proposed method develops a classifier for human protein function prediction. The model uses decision tree for classification process. The protein function is predicted on the basis of matched sequence derived features per each protein function. The research work includes the development of a tool which determines sequence derived features by analyzing different parameters. The other sequence derived features are determined using various web based tools.

Keywords: Sequence Derived Features, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1908
5515 Web-Based Cognitive Writing Instruction (WeCWI): A Hybrid e-Framework for Instructional Design

Authors: Boon Yih Mah

Abstract:

Web-based Cognitive Writing Instruction (WeCWI) is a hybrid e-framework for the development of a web-based instruction (WBI), which contributes towards instructional design and language development. WeCWI divides its contribution in instructional design into macro and micro perspectives. In macro perspective, being a 21st century educator by disseminating knowledge and sharing ideas with the in-class and global learners is initiated. By leveraging the virtue of technology, WeCWI aims to transform an educator into an aggregator, curator, publisher, social networker and ultimately, a web-based instructor. Since the most notable contribution of integrating technology is being a tool of teaching as well as a stimulus for learning, WeCWI focuses on the use of contemporary web tools based on the multiple roles played by the 21st century educator. The micro perspective in instructional design draws attention to the pedagogical approaches focusing on three main aspects: reading, discussion, and writing. With the effective use of pedagogical approaches through free reading and enterprises, technology adds new dimensions and expands the boundaries of learning capacity. Lastly, WeCWI also imparts the fundamental theories and models for web-based instructors’ awareness such as interactionist theory, cognitive information processing (CIP) theory, computer-mediated communication (CMC), e-learning interactionalbased model, inquiry models, sensory mind model, and leaning styles model.

Keywords: WeCWI, instructional discovery, technological discovery, pedagogical discovery, theoretical discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2202
5514 Discovery of Quantified Hierarchical Production Rules from Large Set of Discovered Rules

Authors: Tamanna Siddiqui, M. Afshar Alam

Abstract:

Automated discovery of Rule is, due to its applicability, one of the most fundamental and important method in KDD. It has been an active research area in the recent past. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of details, and to focus our attention on the interesting aspects only. One of such efficient and easy to understand systems is Hierarchical Production rule (HPRs) system. A HPR, a standard production rule augmented with generality and specificity information, is of the following form: Decision If < condition> Generality Specificity . HPRs systems are capable of handling taxonomical structures inherent in the knowledge about the real world. This paper focuses on the issue of mining Quantified rules with crisp hierarchical structure using Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses Quantified production rules as initial individuals of GP and discovers hierarchical structure. In proposed approach rules are quantified by using Dempster Shafer theory. Suitable genetic operators are proposed for the suggested encoding. Based on the Subsumption Matrix(SM), an appropriate fitness function is suggested. Finally, Quantified Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy, using Dempster Shafer theory. Experimental results are presented to demonstrate the performance of the proposed algorithm.

Keywords: Knowledge discovery in database, quantification, dempster shafer theory, genetic programming, hierarchy, subsumption matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1488
5513 Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Authors: Eiad Yafi, M. A. Alam, Ranjit Biswas

Abstract:

Knowledge Discovery of Databases (KDD) is the process of extracting previously unknown but useful and significant information from large massive volume of databases. Data Mining is a stage in the entire process of KDD which applies an algorithm to extract interesting patterns. Usually, such algorithms generate huge volume of patterns. These patterns have to be evaluated by using interestingness measures to reflect the user requirements. Interestingness is defined in different ways, (i) Objective measures (ii) Subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user perspective. In this report, we try to brief the more widely spread and successful subjective measures and propose a new subjective measure of interestingness, i.e. shocking.

Keywords: Shocking rules (SHR).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1497
5512 Routing Load Analysis over 802.11 DCF of Reactive Routing Protocols DSR and DYMO

Authors: Parma Nand, S.C. Sharma

Abstract:

The Mobile Ad-hoc Network (MANET) is a collection of self-configuring and rapidly deployed mobile nodes (routers) without any central infrastructure. Routing is one of the potential issues. Many routing protocols are reported but it is difficult to decide which one is best in all scenarios. In this paper on demand routing protocols DSR and DYMO based on IEEE 802.11 DCF MAC protocol are examined and characteristic summary of these routing protocols is presented. Their performance is analyzed and compared on performance measuring metrics throughput, dropped packets due to non availability of routes, duplicate RREQ generated for route discovery and normalized routing load by varying CBR data traffic load using QualNet 5.0.2 network simulator.

Keywords: Adhoc networks, wireless networks, CBR, routingprotocols, route discovery, simulation, performance evaluation, MAC, IEEE 802.11.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1681
5511 Social Semantic Web-Based Analytics Approach to Support Lifelong Learning

Authors: Khaled Halimi, Hassina Seridi-Bouchelaghem

Abstract:

The purpose of this paper is to describe how learning analytics approaches based on social semantic web techniques can be applied to enhance the lifelong learning experiences in a connectivist perspective. For this reason, a prototype of a system called SoLearn (Social Learning Environment) that supports this approach. We observed and studied literature related to lifelong learning systems, social semantic web and ontologies, connectivism theory, learning analytics approaches and reviewed implemented systems based on these fields to extract and draw conclusions about necessary features for enhancing the lifelong learning process. The semantic analytics of learning can be used for viewing, studying and analysing the massive data generated by learners, which helps them to understand through recommendations, charts and figures their learning and behaviour, and to detect where they have weaknesses or limitations. This paper emphasises that implementing a learning analytics approach based on social semantic web representations can enhance the learning process. From one hand, the analysis process leverages the meaning expressed by semantics presented in the ontology (relationships between concepts). From the other hand, the analysis process exploits the discovery of new knowledge by means of inferring mechanism of the semantic web.

Keywords: Connectivism, data visualization, informal learning, learning analytics, semantic web, social web.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 768
5510 Identification of Arglecins B and C and Actinofuranosin A from a Termite Gut-Associated Streptomyces Species

Authors: Christian A. Romero, Tanja Grkovic, John. R. J. French, D. İpek. Kurtböke, Ronald J. Quinn

Abstract:

A high-throughput and automated 1H NMR metabolic fingerprinting dereplication approach was used to accelerate the discovery of unknown bioactive secondary metabolites. The applied dereplication strategy accelerated the discovery of new natural products, provided rapid and competent identification and quantification of the known secondary metabolites and avoided time-consuming isolation procedures. The effectiveness of the technique was demonstrated by the isolation and elucidation of arglecins B (1), C (2) and actinofuranosin A (3) from a termite-gut associated Streptomyces sp. (USC 597) grown under solid state fermentation. The structures of these compounds were elucidated by extensive interpretation of 1H, 13C and 2D NMR spectroscopic data. These represent the first report of arglecin analogues isolated from a termite gut-associated Streptomyces species.

Keywords: Actinomycetes, actinofuranosin, antibiotics, arglecins, NMR spectroscopy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 107
5509 Device Discover: A Component for Network Management System using Simple Network Management Protocol

Authors: Garima Gupta, Daya Gupta

Abstract:

Virtually all existing networked system management tools use a Manager/Agent paradigm. That is, distributed agents are deployed on managed devices to collect local information and report it back to some management unit. Even those that use standard protocols such as SNMP fall into this model. Using standard protocol has the advantage of interoperability among devices from different vendors. However, it may not be able to provide customized information that is of interest to satisfy specific management needs. In this dissertation work, different approaches are used to collect information regarding the devices attached to a Local Area Network. An SNMP aware application is being developed that will manage the discovery procedure and will be used as data collector.

Keywords: ICMP Scanner, Network Discovery, NetworkManagement, SNMP Scanner.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1629
5508 Applications of Genetic Programming in Data Mining

Authors: Saleh Mesbah Elkaffas, Ahmed A. Toony

Abstract:

This paper details the application of a genetic programming framework for induction of useful classification rules from a database of income statements, balance sheets, and cash flow statements for North American public companies. Potentially interesting classification rules are discovered. Anomalies in the discovery process merit further investigation of the application of genetic programming to the dataset for the problem domain.

Keywords: Genetic programming, data mining classification rule.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1511
5507 On the Move to Semantic Web Services

Authors: Jorge Cardoso

Abstract:

Semantic Web services will enable the semiautomatic and automatic annotation, advertisement, discovery, selection, composition, and execution of inter-organization business logic, making the Internet become a common global platform where organizations and individuals communicate with each other to carry out various commercial activities and to provide value-added services. There is a growing consensus that Web services alone will not be sufficient to develop valuable solutions due the degree of heterogeneity, autonomy, and distribution of the Web. This paper deals with two of the hottest R&D and technology areas currently associated with the Web – Web services and the Semantic Web. It presents the synergies that can be created between Web Services and Semantic Web technologies to provide a new generation of eservices.

Keywords: Semantic Web, Web service, Web process, WWW.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1246
5506 Machine Learning Methods for Network Intrusion Detection

Authors: Mouhammad Alkasassbeh, Mohammad Almseidin

Abstract:

Network security engineers work to keep services available all the time by handling intruder attacks. Intrusion Detection System (IDS) is one of the obtainable mechanisms that is used to sense and classify any abnormal actions. Therefore, the IDS must be always up to date with the latest intruder attacks signatures to preserve confidentiality, integrity, and availability of the services. The speed of the IDS is a very important issue as well learning the new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases) KDD dataset is very handy for testing and evaluating different Machine Learning Techniques. It mainly focuses on the KDD preprocess part in order to prepare a decent and fair experimental data set. The J48, MLP, and Bayes Network classifiers have been chosen for this study. It has been proven that the J48 classifier has achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type DOS, R2L, U2R, and PROBE.

Keywords: IDS, DDoS, MLP, KDD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 682
5505 Data Mining for Cancer Management in Egypt Case Study: Childhood Acute Lymphoblastic Leukemia

Authors: Nevine M. Labib, Michael N. Malek

Abstract:

Data Mining aims at discovering knowledge out of data and presenting it in a form that is easily comprehensible to humans. One of the useful applications in Egypt is the Cancer management, especially the management of Acute Lymphoblastic Leukemia or ALL, which is the most common type of cancer in children. This paper discusses the process of designing a prototype that can help in the management of childhood ALL, which has a great significance in the health care field. Besides, it has a social impact on decreasing the rate of infection in children in Egypt. It also provides valubale information about the distribution and segmentation of ALL in Egypt, which may be linked to the possible risk factors. Undirected Knowledge Discovery is used since, in the case of this research project, there is no target field as the data provided is mainly subjective. This is done in order to quantify the subjective variables. Therefore, the computer will be asked to identify significant patterns in the provided medical data about ALL. This may be achieved through collecting the data necessary for the system, determimng the data mining technique to be used for the system, and choosing the most suitable implementation tool for the domain. The research makes use of a data mining tool, Clementine, so as to apply Decision Trees technique. We feed it with data extracted from real-life cases taken from specialized Cancer Institutes. Relevant medical cases details such as patient medical history and diagnosis are analyzed, classified, and clustered in order to improve the disease management.

Keywords: Data Mining, Decision Trees, Knowledge Discovery, Leukemia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2168
5504 Genetic Algorithms in Hot Steel Rolling for Scale Defect Prediction

Authors: Jarno Haapamäki, Juha Röning

Abstract:

Scale defects are common surface defects in hot steel rolling. The modelling of such defects is problematic and their causes are not straightforward. In this study, we investigated genetic algorithms in search for a mathematical solution to scale formation. For this research, a high-dimensional data set from hot steel rolling process was gathered. The synchronisation of the variables as well as the allocation of the measurements made on the steel strip were solved before the modelling phase.

Keywords: Genetic algorithms, hot strip rolling, knowledge discovery, modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3258
5503 Customers’ Priority to Implement SSTs Using AHP Analysis

Authors: Mohammad Jafariahangari, Marjan Habibi, Miresmaeil Mirnabibaboli, Mirza Hassan Hosseini

Abstract:

Self-service technologies (SSTs) make an important contribution to the daily life of people nowadays. However, the introduction of SST does not lead to its usage. Thereby, this paper was an attempt on discovery of the most preferred SST in the customers’ point of view. To fulfill this aim, the Analytical Hierarchy Process (AHP) was applied based on Saaty’s questionnaire which was administered to the customers of e-banking services located in Golestan providence, northern Iran. This study used qualitative factors in association with the intention of consumers’ usage of SSTs to rank three SSTs: ATM, mobile banking and internet banking. The results showed that mobile banking get the highest weight in consumers’ point of view. This research can be useful both for managers and service providers and also for customers who intend to use e-banking.

Keywords: Analytical Hierarchy Process, Decision-making, Ebanking, Iran, Self-service technologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2170
5502 Discovery and Capture of Organizational Knowledge from Unstructured Information

Authors: J. Gu, W.B. Lee, C.F. Cheung, E. Tsui, W.M. Wang

Abstract:

Knowledge of an organization does not merely reside in structured form of information and data; it is also embedded in unstructured form. The discovery of such knowledge is particularly difficult as the characteristic is dynamic, scattered, massive and multiplying at high speed. Conventional methods of managing unstructured information are considered too resource demanding and time consuming to cope with the rapid information growth. In this paper, a Multi-faceted and Automatic Knowledge Elicitation System (MAKES) is introduced for the purpose of discovery and capture of organizational knowledge. A trial implementation has been conducted in a public organization to achieve the objective of decision capture and navigation from a number of meeting minutes which are autonomously organized, classified and presented in a multi-faceted taxonomy map in both document and content level. Key concepts such as critical decision made, key knowledge workers, knowledge flow and the relationship among them are elicited and displayed in predefined knowledge model and maps. Hence, the structured knowledge can be retained, shared and reused. Conducting Knowledge Management with MAKES reduces work in searching and retrieving the target decision, saves a great deal of time and manpower, and also enables an organization to keep pace with the knowledge life cycle. This is particularly important when the amount of unstructured information and data grows extremely quickly. This system approach of knowledge management can accelerate value extraction and creation cycles of organizations.

Keywords: Knowledge-Based System, Knowledge Elicitation, Knowledge Management, Taxonomy, Unstructured Information Management

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1803
5501 FCA-based Conceptual Knowledge Discovery in Folksonomy

Authors: Yu-Kyung Kang, Suk-Hyung Hwang, Kyoung-Mo Yang

Abstract:

The tagging data of (users, tags and resources) constitutes a folksonomy that is the user-driven and bottom-up approach to organizing and classifying information on the Web. Tagging data stored in the folksonomy include a lot of very useful information and knowledge. However, appropriate approach for analyzing tagging data and discovering hidden knowledge from them still remains one of the main problems on the folksonomy mining researches. In this paper, we have proposed a folksonomy data mining approach based on FCA for discovering hidden knowledge easily from folksonomy. Also we have demonstrated how our proposed approach can be applied in the collaborative tagging system through our experiment. Our proposed approach can be applied to some interesting areas such as social network analysis, semantic web mining and so on.

Keywords: Folksonomy data mining, formal concept analysis, collaborative tagging, conceptual knowledge discovery, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1989
5500 Pattern Discovery from Student Feedback: Identifying Factors to Improve Student Emotions in Learning

Authors: Angelina A. Tzacheva, Jaishree Ranganathan

Abstract:

Interest in (STEM) Science Technology Engineering Mathematics education especially Computer Science education has seen a drastic increase across the country. This fuels effort towards recruiting and admitting a diverse population of students. Thus the changing conditions in terms of the student population, diversity and the expected teaching and learning outcomes give the platform for use of Innovative Teaching models and technologies. It is necessary that these methods adapted should also concentrate on raising quality of such innovations and have positive impact on student learning. Light-Weight Team is an Active Learning Pedagogy, which is considered to be low-stake activity and has very little or no direct impact on student grades. Emotion plays a major role in student’s motivation to learning. In this work we use the student feedback data with emotion classification using surveys at a public research institution in the United States. We use Actionable Pattern Discovery method for this purpose. Actionable patterns are patterns that provide suggestions in the form of rules to help the user achieve better outcomes. The proposed method provides meaningful insight in terms of changes that can be incorporated in the Light-Weight team activities, resources utilized in the course. The results suggest how to enhance student emotions to a more positive state, in particular focuses on the emotions ‘Trust’ and ‘Joy’.

Keywords: Actionable pattern discovery, education, emotion, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 465
5499 Link Availability Estimation for Modified AOMDV Protocol

Authors: R. Prabha, N. Ramaraj

Abstract:

Routing in adhoc networks is a challenge as nodes are mobile, and links are constantly created and broken. Present ondemand adhoc routing algorithms initiate route discovery after a path breaks, incurring significant cost to detect disconnection and establish a new route. Specifically, when a path is about to be broken, the source is warned of the likelihood of a disconnection. The source then initiates path discovery early, avoiding disconnection totally. A path is considered about to break when link availability decreases. This study modifies Adhoc On-demand Multipath Distance Vector routing (AOMDV) so that route handoff occurs through link availability estimation.

Keywords: Mobile Adhoc Network (MANET), Routing, Adhoc On-demand Multipath Distance Vector routing (AOMDV), Link Availability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1578
5498 Incremental Mining of Shocking Association Patterns

Authors: Eiad Yafi, Ahmed Sultan Al-Hegami, M. A. Alam, Ranjit Biswas

Abstract:

Association rules are an important problem in data mining. Massively increasing volume of data in real life databases has motivated researchers to design novel and incremental algorithms for association rules mining. In this paper, we propose an incremental association rules mining algorithm that integrates shocking interestingness criterion during the process of building the model. A new interesting measure called shocking measure is introduced. One of the main features of the proposed approach is to capture the user background knowledge, which is monotonically augmented. The incremental model that reflects the changing data and the user beliefs is attractive in order to make the over all KDD process more effective and efficient. We implemented the proposed approach and experiment it with some public datasets and found the results quite promising.

Keywords: Knowledge discovery in databases (KDD), Data mining, Incremental Association rules, Domain knowledge, Interestingness, Shocking rules (SHR).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1837
5497 Approaches to Developing Semantic Web Services

Authors: Jorge Cardoso

Abstract:

It has been recognized that due to the autonomy and heterogeneity, of Web services and the Web itself, new approaches should be developed to describe and advertise Web services. The most notable approaches rely on the description of Web services using semantics. This new breed of Web services, termed semantic Web services, will enable the automatic annotation, advertisement, discovery, selection, composition, and execution of interorganization business logic, making the Internet become a common global platform where organizations and individuals communicate with each other to carry out various commercial activities and to provide value-added services. This paper deals with two of the hottest R&D and technology areas currently associated with the Web – Web services and the semantic Web. It describes how semantic Web services extend Web services as the semantic Web improves the current Web, and presents three different conceptual approaches to deploying semantic Web services, namely, WSDL-S, OWL-S, and WSMO.

Keywords: Semantic Web, Web service, Web process, WWW

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1391
5496 An Efficient Framework to Build Up Malware Dataset

Authors: Madihah Mohd Saudi, Zul Hilmi Abdullah

Abstract:

This research paper presents a framework on how to build up malware dataset.Many researchers took longer time to clean the dataset from any noise or to transform the dataset into a format that can be used straight away for testing. Therefore, this research is proposing a framework to help researchers to speed up the malware dataset cleaningprocesses which later can be used for testing. It is believed, an efficient malware dataset cleaning processes, can improved the quality of the data, thus help to improve the accuracy and the efficiency of the subsequent analysis. Apart from that, an in-depth understanding of the malware taxonomy is also important prior and during the dataset cleaning processes. A new Trojan classification has been proposed to complement this framework.This experiment has been conducted in a controlled lab environment and using the dataset from VxHeavens dataset. This framework is built based on the integration of static and dynamic analyses, incident response method and knowledge database discovery (KDD) processes.This framework can be used as the basis guideline for malware researchers in building malware dataset.

Keywords: Dataset, knowledge database discovery (KDD), malware, static and dynamic analyses.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3429
5495 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns

Authors: Haider A Ramadhan, Khalil Shihab

Abstract:

Clustering techniques have been used by many intelligent software agents to group similar access patterns of the Web users into high level themes which express users intentions and interests. However, such techniques have been mostly focusing on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate themes with concentrated themes, and hence build a more sound model of the user behavior. The purpose of this paper is two fold: use distance based clustering methods to recognize overall themes from the Proxy log file, and suggest an efficient cut off levels for the keyword and similarity thresholds which tend to produce more optimal clusters with better focus and efficient size.

Keywords: Data mining, knowledge discovery, clustering, dataanalysis, Web log analysis, theme based searching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1412
5494 A Fitted Random Sampling Scheme for Load Distribution in Grid Networks

Authors: O. A. Rahmeh, P. Johnson, S. Lehmann

Abstract:

Grid networks provide the ability to perform higher throughput computing by taking advantage of many networked computer-s resources to solve large-scale computation problems. As the popularity of the Grid networks has increased, there is a need to efficiently distribute the load among the resources accessible on the network. In this paper, we present a stochastic network system that gives a distributed load-balancing scheme by generating almost regular networks. This network system is self-organized and depends only on local information for load distribution and resource discovery. The in-degree of each node is refers to its free resources, and job assignment and resource discovery processes required for load balancing is accomplished by using fitted random sampling. Simulation results show that the generated network system provides an effective, scalable, and reliable load-balancing scheme for the distributed resources accessible on Grid networks.

Keywords: Complex networks, grid networks, load-balancing, random sampling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1742