Search results for: knowledge discovery from databases

2152 How Efficiency of Password Attack Based on a Keyboard

Authors: Hsien-cheng Chou, Fei-pei Lai, Hung-chang Lee

Abstract:

At present, the dictionary attack is the basic tool for recovering key passwords. To avoid dictionary attacks, users purposely choose other character strings as passwords. According to statistics, about 14% of users choose keys on a keyboard (Kkey, for short) as passwords. This paper develops a framework system to attack passwords chosen from Kkeys and analyzes its efficiency. Within this system, we build keyboard rules using the adjacent and parallel relationships among Kkeys and then use these Kkey rules to generate password databases by the depth-first search method. According to the experimental results, we find that the key space of the databases derived from these Kkey rules can be far smaller than that of the password databases generated by a brute-force attack, thus effectively narrowing down the scope of attack research. Taking one general Kkey rule, the combinations of all printable characters (94 types) with Kkey adjacent and parallel relationships, as an example, the derived key space is about 240 smaller than that of a brute-force attack. In addition, we demonstrate the method's practicality and value by successfully cracking the access passwords to UNIX and PC using the password databases created.
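
As a rough illustration of the idea in this abstract, the sketch below enumerates "keyboard walk" strings by depth-first search over an adjacency map and compares the count with the brute-force key space. The layout `ROWS`, the adjacency rule (left, right, up, down on an unstaggered grid) and the function names are assumptions made for the example; the paper's actual Kkey rules are not given in the abstract.

```python
# Hypothetical, simplified layout: letter rows only, treated as an unstaggered grid.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_adjacency(rows):
    """Map each key to its left/right/up/down neighbours on the grid."""
    adj = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            neighbours = []
            for dr, dc in ((0, -1), (0, 1), (-1, 0), (1, 0)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < len(rows) and 0 <= cc < len(rows[rr]):
                    neighbours.append(rows[rr][cc])
            adj[key] = neighbours
    return adj


def keyboard_walks(adj, length):
    """Yield every string of `length` keys in which consecutive keys are adjacent."""
    def dfs(prefix):
        if len(prefix) == length:
            yield prefix
            return
        # Depth-first: extend the current prefix by each neighbour of its last key.
        for nxt in adj[prefix[-1]]:
            yield from dfs(prefix + nxt)

    for start in adj:
        yield from dfs(start)


if __name__ == "__main__":
    adj = build_adjacency(ROWS)
    walks = sum(1 for _ in keyboard_walks(adj, 4))
    print("keyboard walks of length 4:", walks)
    print("brute-force strings of length 4:", 26 ** 4)
```

Because each key has at most four neighbours under this rule, the number of walks grows far more slowly with length than the full alphabet raised to that length, which is the effect the abstract describes.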

Keywords: Brute-force attack, dictionary attack, depth-first search, password attack.

Downloads: 3475

2151 Corporate Knowledge Communication and Knowledge Communication Difficulties

Authors: H. Buluthan Cetintas, M. Nejat Ozupek

Abstract:

Communication is an important factor and a support in directing corporate activities efficiently, in ensuring the flow of knowledge which is necessary for the continuity of the institution, in creating a common language in the institution, in transferring corporate culture and ultimately in corporate success. The idea of transmitting knowledge among workers in a healthy manner has revived knowledge communication. Knowledge communication can be defined as the act of mutual creation and communication of intuitions, assessments, experiences and capabilities. As long as it is maintained effectively, it can provide advantages such as corporate continuity, access to corporate objectives and making correct administrative decisions. Although the benefits of knowledge communication to corporations are known, and the necessary worth and care are given to it, some hardships may arise which make it difficult or even block it. In this article, difficulties that prevent knowledge communication will be discussed and solutions will be proposed.

Keywords: Corporate knowledge communication, knowledge communication, knowledge communication barriers

Downloads: 1436

2150 Modeling of Knowledge-Intensive Business Processes

Authors: Eckhard M. Ammann

Abstract:

Knowledge development in companies relies on knowledge-intensive business processes, which are characterized by a high complexity in their execution, weak structuring, communication-oriented tasks and high decision autonomy, and often the need for creativity and innovation. A foundation of knowledge development is provided, which is based on a new conception of knowledge and knowledge dynamics. This conception consists of a three-dimensional model of knowledge with types, kinds and qualities. Built on this knowledge conception, knowledge dynamics is modeled with the help of general knowledge conversions between knowledge assets. Here knowledge dynamics is understood to cover all of acquisition, conversion, transfer, development and usage of knowledge. Through this conception we gain a sound basis for knowledge management and development in an enterprise. Especially the type dimension of knowledge, which categorizes it according to its internality and externality with respect to the human being, is crucial for enterprise knowledge management and development, because knowledge should be made available by converting it to more external types. Built on this conception, a modeling approach for knowledge-intensive business processes is introduced, be it human-driven, e-driven or task-driven processes. As an example for this approach, a model of the creative activity for the renewal planning of a product is given.

Keywords: Conception of knowledge, knowledge dynamics, modeling notation, knowledge-intensive business processes.

Downloads: 1837

2149 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors consists of important geographical information that can be used for remote sensing applications such as region planning and disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered a classification problem; this task can be performed using machine-learning techniques. Despite the many machine-learning algorithms available, the classification is done using supervised classifiers such as Support Vector Machines (SVM), as the area of interest is known. We proposed a classification method which considers neighboring pixels in a region for feature extraction and evaluates classifications precisely according to neighboring classes for semantic interpretation of the region of interest (ROI). A dataset has been created for training and testing purposes; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be applied to image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: Remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction.

Downloads: 2055

2148 Application and Limitation of Parallel Modeling in Multidimensional Sequential Pattern

Authors: Mahdi Esmaeili, Mansour Tarafdar

Abstract:

The goal of data mining algorithms is to discover useful information embedded in large databases. One of the most important data mining problems is the discovery of frequently occurring patterns in sequential data. In a multidimensional sequence each event depends on more than one dimension. The search space is quite large and serial algorithms are not scalable for very large datasets. To address this, it is necessary to study scalable parallel implementations of sequence mining algorithms. In this paper, we present a model for multidimensional sequences and describe a parallel algorithm based on data parallelism. Simulation experiments show good load balancing and scalable, acceptable speedup over different numbers of processors and problem sizes, and demonstrate that our approach can work efficiently in a real parallel computing environment.
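
The data-parallel idea mentioned in this abstract can be sketched as follows: the sequence database is split into partitions, each worker counts how many of its sequences contain a candidate pattern, and the partial counts are summed. This is a minimal sketch, not the authors' algorithm. Events are modelled as tuples with one value per dimension, a thread pool stands in for real parallel processors, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def is_subsequence(pattern, sequence):
    """True if the events of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `event in remaining` consumes the iterator up to the first match,
    # so later events must appear after earlier ones.
    return all(event in remaining for event in pattern)


def local_support(partition, pattern):
    """Count the sequences in one partition that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in partition)


def parallel_support(sequences, pattern, workers=2):
    """Split the database round-robin, count per partition, then add up."""
    chunks = [sequences[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(local_support, chunks, [pattern] * workers)
    return sum(counts)


if __name__ == "__main__":
    # Each event is (action, device): two dimensions per event.
    db = [
        [("login", "mobile"), ("search", "mobile"), ("buy", "web")],
        [("login", "mobile"), ("buy", "web")],
        [("buy", "web"), ("login", "mobile")],
    ]
    print(parallel_support(db, [("login", "mobile"), ("buy", "web")]))
```

Round-robin partitioning is used here only because it keeps partition sizes even; load balancing in a real system would depend on sequence lengths as well.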

Keywords: Sequential Patterns, Data Mining, Parallel Algorithm, Multidimensional Sequence Data

Downloads: 1477

2147 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, many efforts and studies have been carried out to develop proficient tools for performing various tasks in big data. Recently big data has received a lot of publicity, for good reasons. Because the datasets are large and complex, they are difficult to process with traditional data processing applications. This concern makes it all the more necessary to produce various big data tools. Moreover, the main aim of big data analytics is to apply advanced analytic techniques to very large, diverse datasets which range in size from terabytes to zettabytes and in type from structured to unstructured and from batch to streaming. Big data is useful for datasets whose size or type is beyond the capability of traditional relational databases to capture, manage and process the data with low latency. Thus the resulting challenges have led to the emergence of powerful big data tools. In this survey, a varied collection of big data tools is illustrated and compared by their salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Downloads: 3775

2146 Anomaly Detection with ANN and SVM for Telemedicine Networks

Authors: Edward Guillén, Jeisson Sánchez, Carlos Omar Ramos

Abstract:

In recent years, a wide variety of applications have been developed with Support Vector Machine (SVM) methods and Artificial Neural Networks (ANN). In general, these methods depend on intrusion knowledge databases such as KDD99, ISCX, and CAIDA, among others. New classes of detectors are generated by machine learning techniques, trained and tested over network databases. Thereafter, detectors are employed to detect anomalies in network communication scenarios according to users' connection behavior. The first detector based on a training dataset is deployed in different real-world networks with mobile and non-mobile devices to analyze the performance and accuracy over static detection. The vulnerabilities are based on previous work in telemedicine apps that were developed in the research group. This paper presents the differences in detection results between some network scenarios by applying traditional detectors deployed with artificial neural networks and support vector machines.

Keywords: Anomaly detection, back-propagation neural networks, network intrusion detection systems, support vector machines.

Downloads: 2009

2145 Implementing Knowledge Transfer Solution through Web-based Help Desk System

Authors: Mazeyanti M. Ariffin, Noreen Izza Arshad, Ainol Rahmah Shaarani, Syed Uzair Shah

Abstract:

Knowledge management (KM) is the process of taking any steps needed to get the most out of available knowledge resources. KM involves several steps: capturing knowledge, discovering new knowledge, sharing knowledge and applying knowledge in the decision-making process. In applying knowledge, it is not necessary for the individual who uses the knowledge to comprehend it, as long as the available knowledge is used in guiding decision making and actions. When an expert is called and provides a step-by-step procedure for solving the caller's problem, the expert is transferring the knowledge or giving direction to the caller, and the caller is 'applying' the knowledge by following the instructions given by the expert. An appropriate mechanism is needed to ensure effective knowledge transfer, which in this case is by telephone or email. The problem with email and telephone is that the knowledge is not fully circulated and disseminated to all users. In this paper, drawing on the related experience of a local university Help Desk, the use of Information Technology (IT) is proposed to effectively support knowledge transfer in the organization. The issues covered include the existing knowledge, the related works, the methodology used in defining the knowledge management requirements, as well as an overview of the prototype.

Keywords: Knowledge Management, Knowledge Transfer, Help Desk, Web-based system.

Downloads: 1781

2144 Effect of Incentives on Knowledge Sharing and Learning – Evidence from the Indian IT Sector

Authors: Asish O. Mathew, Lewlyn L. R. Rodrigues

Abstract:

Organizations in the knowledge economy era have recognized the importance of building knowledge assets for sustainable growth and development. In comparison to other industries, Information Technology (IT) enterprises hold an edge in developing an effective Knowledge Management (KM) programme thanks to their in-house technological abilities. This paper studies the various knowledge-based incentive programmes and their effect on Knowledge Sharing and Learning in the context of the Indian IT sector. A conceptual model is developed linking KM Incentives, Knowledge Sharing and Learning. A questionnaire study is conducted to collect primary data from the knowledge workers of IT organizations located in India. The data was analysed using Structural Equation Modeling with the Partial Least Squares method. The results show a strong influence of knowledge management incentives on knowledge sharing and an indirect influence on learning.

Keywords: Knowledge Management, Knowledge Management Incentives, Knowledge Sharing, Learning.

Downloads: 3690

2143 Architecting a Knowledge Theatre

Authors: David C. White

Abstract:

This paper describes the architectural design considerations for building a new class of application, a Personal Knowledge Integrator, and a particular example, a Knowledge Theatre. It then supports this description by describing a scenario of a child acquiring knowledge and how this process could be augmented by the proposed architecture and design of a Knowledge Theatre. David Merrill's first “principles of instruction” are kept in focus to provide a background against which to view the learning potential.

Keywords: Knowledge, personal, open data, visualization, learning, teaching

Downloads: 1338

2142 Using Data Mining in Automotive Safety

Authors: Carine Cridelich, Pablo Juesas Cano, Emmanuel Ramasso, Noureddine Zerhouni, Bernd Weiler

Abstract:

Safety is one of the most important considerations when buying a new car. While active safety aims at avoiding accidents, passive safety systems such as airbags and seat belts protect the occupant in case of an accident. In addition to legal regulations, organizations like Euro NCAP provide consumers with an independent assessment of the safety performance of cars and drive the development of safety systems in the automobile industry. Those ratings are mainly based on injury assessment reference values derived from physical parameters measured in dummies during a car crash test. The components and sub-systems of a safety system are designed to achieve the required restraint performance. Sled tests and other types of tests are then carried out by car makers and their suppliers to confirm the protection level of the safety system. A Knowledge Discovery in Databases (KDD) process is proposed in order to minimize the number of tests. The KDD process is based on the data emerging from sled tests according to Euro NCAP specifications. About 30 parameters of the passive safety systems from different data sources (crash data, dummy protocol) are first analysed together with experts' opinions. A procedure is proposed to manage missing data and is validated on real data sets. Finally, a procedure is developed to estimate a set of rough initial parameters of the passive system before testing, with the aim of reducing the number of tests.

Keywords: KDD process, passive safety systems, sled test, dummy injury assessment reference values, frontal impact

Downloads: 2844

2141 Determination of Adequate Fuzzy Inequalities for their Usage in Fuzzy Query Languages

Authors: Marcel Shirvanian, Wolfram Lippe

Abstract:

Although the usefulness of fuzzy databases has been pointed out in several works, they are not fully developed in numerous domains. A task that is mostly disregarded and which is the topic of this paper is the determination of suitable inequalities for fuzzy sets in fuzzy query languages. This paper examines which kinds of fuzzy inequalities exist at all. Afterwards, different procedures are presented that appear theoretically appropriate. By being applied to various examples, their strengths and weaknesses are revealed. Furthermore, an algorithm for an efficient computation of the selected fuzzy inequality is shown.

Keywords: Fuzzy Databases, Fuzzy Inequalities, Fuzzy Query Languages, Fuzzy Ranking.

Downloads: 1361

2140 The Development of a Narrative Management System: Storytelling in Knowledge Management

Authors: Savita K.S, Hazwani H., Kalid K. S.

Abstract:

This paper presents a narrative management system for organizations to capture an organization's tacit knowledge through stories. The intention of capturing tacit knowledge is to address the problem that comes with the mobility of the workforce in an organisation. Storytelling in the knowledge management context is seen as a powerful management tool to communicate tacit knowledge in an organization. This narrative management system is developed firstly to enable uploading of many types of knowledge sharing stories, from general to work-related stories, and secondly to give each video comment functionality, where knowledge users can post comments to other knowledge users. The narrative management system allows users to browse, search and view the stories. In the system, stories are stored in a video repository. Stories produced from this framework will improve learning, knowledge transfer facilitation and tacit knowledge quality in an organization.

Keywords: Knowledge Management, Storytelling, Stories, Tacit Knowledge

Downloads: 2443

2139 M2LGP: Mining Multiple Level Gradual Patterns

Authors: Yogi Satrya Aryadinata, Anne Laurent, Michel Sala

Abstract:

Gradual patterns have been studied for many years as they contain precious information. They have been integrated in many expert systems and rule-based systems, for instance to reason on knowledge such as “the greater the number of turns, the greater the number of car crashes”. In many cases, this knowledge has been considered as a rule “the greater the number of turns → the greater the number of car crashes”. Historically, works have thus been focused on the representation of such rules, studying how implication could be defined, especially fuzzy implication. These rules were defined by experts who were in charge of describing the systems they were working on so that they could operate automatically. More recently, approaches have been proposed in order to mine databases for automatically discovering such knowledge. Several approaches have been studied, the main scientific topics being how to determine what is a relevant gradual pattern, and how to discover such patterns as efficiently as possible (in terms of both memory and CPU usage). However, in some cases, end-users are not interested in raw level knowledge, and are rather interested in trends. Moreover, it may be the case that no relevant pattern can be discovered at a low level of granularity (e.g. city), whereas some can be discovered at a higher level (e.g. county). In this paper, we thus extend gradual pattern approaches in order to consider multiple level gradual patterns. For this purpose, we consider two aggregation policies, namely horizontal and vertical.
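
To make the city versus county remark concrete, the sketch below scores the pattern "the greater X, the greater Y" as the fraction of row pairs that are ordered the same way on both attributes, which is one support definition used in the gradual-pattern literature; the abstract does not say which definition the paper adopts. The `roll_up` helper, which averages rows per group, is a hypothetical aggregation chosen for illustration and is not claimed to be either of the paper's horizontal or vertical policies. The data is invented.

```python
from collections import defaultdict
from itertools import combinations


def gradual_support(rows, x, y):
    """Fraction of row pairs ordered the same way on attributes x and y."""
    pairs = list(combinations(rows, 2))
    if not pairs:
        return 0.0
    concordant = sum(
        1
        for a, b in pairs
        if (a[x] < b[x] and a[y] < b[y]) or (a[x] > b[x] and a[y] > b[y])
    )
    return concordant / len(pairs)


def roll_up(rows, level, attrs):
    """Aggregate rows to a coarser level by averaging each attribute per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[level]].append(row)
    return [
        {level: key, **{a: sum(r[a] for r in members) / len(members) for a in attrs}}
        for key, members in groups.items()
    ]


if __name__ == "__main__":
    cities = [
        {"city": "A1", "county": "A", "turns": 1, "crashes": 5},
        {"city": "A2", "county": "A", "turns": 3, "crashes": 1},
        {"city": "B1", "county": "B", "turns": 2, "crashes": 6},
        {"city": "B2", "county": "B", "turns": 6, "crashes": 4},
    ]
    counties = roll_up(cities, "county", ["turns", "crashes"])
    print("city level:", gradual_support(cities, "turns", "crashes"))
    print("county level:", gradual_support(counties, "turns", "crashes"))
```

In this toy data the pattern is weak at city level and holds fully at county level, mirroring the situation the abstract describes.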

Keywords: Gradual Pattern.

Downloads: 1500

2138 Specification of Agent Explicit Knowledge in Cryptographic Protocols

Authors: Khair Eddin Sabri, Ridha Khedri, Jason Jaskolka

Abstract:

Cryptographic protocols are widely used in various applications to provide secure communications. They are usually represented as communicating agents that send and receive messages. These agents use their knowledge to exchange information and communicate with other agents involved in the protocol. An agent's knowledge can be partitioned into explicit knowledge and procedural knowledge. The explicit knowledge refers to the set of information which is either proper to the agent or directly obtained from other agents through communication. The procedural knowledge relates to the set of mechanisms used to get new information from what is already available to the agent. In this paper, we propose a mathematical framework which specifies the explicit knowledge of an agent involved in a cryptographic protocol. Modelling this knowledge is crucial for the specification, analysis, and implementation of cryptographic protocols. We also report on a prototype tool that allows the representation and the manipulation of the explicit knowledge.

Keywords: Information Algebra, Agent Knowledge, Cryptographic Protocols

Downloads: 1474

2137 Management of Cultural Heritage: Bologna Gates

Authors: A. Ippolito, C. Bartolomei

Abstract:

A growing demand is felt today for realistic 3D models enabling the cognition and popularization of historical-artistic heritage. Evaluation and preservation of Cultural Heritage is inextricably connected with the innovative processes of gaining, managing, and using knowledge. The development and perfecting of techniques for acquiring and elaborating photorealistic 3D models has made them pivotal elements for popularizing information about objects on the scale of architectonic structures.

Keywords: Cultural heritage, databases, non-contact survey, 2D-3D models.

Downloads: 2250

2136 Data-organization Before Learning Multi-Entity Bayesian Networks Structure

Authors: H. Bouhamed, A. Rebai, T. Lecroq, M. Jaoua

Abstract:

The objective of our work is to develop a new approach for discovering knowledge from a large mass of data. The result of applying this approach will be an expert system that will serve as a diagnostic tool for a phenomenon related to a huge information system. We first recall the general problem of learning Bayesian network structure from data and suggest a solution for reducing the complexity by using data organization and optimization methods. Afterward we propose a new heuristic for learning Multi-Entity Bayesian Network structures. We have applied our approach to biological facts concerning hereditary complex illnesses, where the biological literature identifies the variables responsible for those diseases. Finally we conclude on the limits reached by this work.

Keywords: Data-organization, data-optimization, automatic knowledge discovery, Multi-Entities Bayesian networks, score merging.

Downloads: 1611

2135 From Individual Memory to Organizational Memory (Intelligence of Organizations)

Authors: A. Bencsik, V. Lıre, I. Marosi

Abstract:

Intensive changes in the environment and strong market competition have raised the management of information and knowledge to the strategic level of companies. In a knowledge-based economy, only those organizations can survive that have up-to-date, special knowledge and are able to exploit and develop it. Companies have to know what knowledge they have by taking a survey of organizational knowledge, and they have to fix actual and additional knowledge in organizational memory. The question is how to identify, acquire, fix and use knowledge effectively. The paper will show that, over and above the tools of information technology supporting the acquisition, storage and use of information, and organizational learning as well as the knowledge that comes into being as a result of it, the fixing and storage of knowledge in the memory of a company play an important role in the intelligence of organizations and the competitiveness of a company.

Keywords: Individual memory, organizational memory, knowledge management, organizational intelligence.

Downloads: 1645

2134 Using Automated Database Reverse Engineering for Database Integration

Authors: M. R. Abbasifard, M. Rahgozar, A. Bayati, P. Pournemati

Abstract:

One important problem in today's organizations is the existence of non-integrated information systems, inconsistency and a lack of suitable correlations between legacy and modern systems. One main solution is to transfer the local databases into a global one. In this regard, we need to extract the data structures from the legacy systems and integrate them with the new technology systems. In legacy systems, huge amounts of data are stored in legacy databases. They require particular attention since they need more effort to be normalized, reformatted and moved to modern database environments. Designing the new integrated (global) database architecture and applying reverse engineering requires data normalization. This paper proposes the use of database reverse engineering in order to integrate legacy and modern databases in organizations. The suggested approach consists of methods and techniques for generating the data transformation rules needed for data structure normalization.

Keywords: Reverse Engineering, Database Integration, System Integration, Data Structure Normalization

Downloads: 1853

2133 IT Management: How IT Managers Gain IT knowledge

Authors: Jes Søndergaard, Torben Tambo, Christian Koch

Abstract:

It is no secret that IT management has become more and more an integrated part of almost all organizations. IT managers possess an enormous amount of knowledge, both organizational knowledge and general IT knowledge. This article investigates how IT managers keep themselves updated on IT knowledge in general and looks into how much time IT managers spend on a weekly basis searching the net for new or problem-solving IT knowledge. The theory used in this paper is used to investigate the current role of IT managers and what issues they are facing. Furthermore, a study is conducted in which 7 IT managers in medium-sized and large Danish companies are interviewed to add further focus on the role of the IT manager and on how they keep themselves updated. Besides finding a substantial need for more research, the study finds that IT managers – generalists or specialists – only have limited knowledge resources at hand for updating their own knowledge – leaving much initiative to vendors.

Keywords: CIO, Information Technology, Knowledge, Management, Organization

Downloads: 1496

2132 What Deter Academia to Share Knowledge within Research-Based University Status

Authors: S. Roziana, R. Azizah, A.R. Hamidah

Abstract:

This paper discusses the issues and challenges that academia faces in knowledge sharing at a research university in Malaysia. Partial interview results from the actual study are presented. The main issues in knowledge sharing practices are university structure and designation and title. Academia's awareness of sharing knowledge is also influenced by culture. Our investigation highlights that the concept of a reciprocal relationship in sharing knowledge may hinder knowledge sharing awareness among academia. Hence, we conclude that further investigation could be carried out on social interaction and trust culture among academia in sharing knowledge within a research/ranking university environment.

Keywords: Knowledge sharing awareness, knowledge sharing practices, research university.

Downloads: 1750

2131 Building a Hierarchical, Granular Knowledge Cube

Authors: Alexander Denzler, Marcel Wehrle, Andreas Meier

Abstract:

A knowledge base stores facts and rules about the world that applications can use for the purpose of reasoning. By applying the concept of granular computing to a knowledge base, several advantages emerge. These can be harnessed by applications to improve their capabilities and performance. In this paper, the concept behind such a construct, called a granular knowledge cube, is defined, and its intended use as an instrument that manages to cope with different data types and detect knowledge domains is elaborated. Furthermore, the underlying architecture, consisting of the three layers of the storing, representing, and structuring of knowledge, is described. Finally, benefits as well as challenges of deploying it are listed alongside application types that could profit from having such an enhanced knowledge base.

Keywords: Granular computing, granular knowledge, hierarchical structuring, knowledge bases.

Downloads: 2377

2130 Knowledge Continuity as a Part of Business Continuity Management

Authors: H. Urbancova, J. Urbanec

Abstract:

Today the intangible assets are the capital of knowledge and are the most important and the most valuable resource for organizations. All employees have knowledge independently of the kind of jobs they do. Knowledge is thus an asset, which influences business operations. The objective of this article is to identify knowledge continuity as an objective of business continuity management. The article has been prepared based on the analysis of secondary sources and the evaluation of primary sources of data by means of a quantitative survey conducted in the Czech Republic. The conclusion of the article is that organizations that apply business continuity management do not focus on the preservation of the knowledge of key employees. Organizations ensure knowledge continuity only intuitively, on a random basis, non-systematically and discontinuously. The non-ensuring of knowledge continuity represents a threat of loss of key knowledge for organizations and can also negatively affect business continuity.

Keywords: Business continuity, knowledge, organizations, survey.

Downloads: 3533

2129 A Survey on Life Science Database Citation Frequency in Scientific Literatures

Authors: Hendry Muljadi, Jiro Araki, Satoru Miyazaki, Asao Fujiyama

Abstract:

There are many databases covering various fields of the life sciences available online. To find well-used databases, a survey measuring life science database citation frequency in the scientific literature was done. The survey measures how many scientific papers available in the PubMed Central archive cite a specific life science database. This paper presents and discusses the results of the survey.

Keywords: Life science, database, metadatabase, PubMed Central.

Downloads: 1425

2128 A Formal Suite of Object Relational Database Metrics

Authors: Justus S, K Iyakutti

Abstract:

Object Relational Databases (ORDB) are more complex in nature than traditional relational databases because they combine the characteristics of both object-oriented concepts and relational features of conventional databases. Design of an ORDB demands an efficient and high-quality schema considering the structural, functional and componential traits. This internal quality of the schema is assured by metrics that measure the relevant attributes. This is extended to substantiate the understandability, usability and reliability of the schema, thus assuring external quality of the schema. This work institutes a formalization of ORDB metrics: metric definition, evaluation methodology and the calibration of the metric. Three ORDB schemas were used to conduct the evaluation and the formalization of the metrics. The metrics are calibrated using content and criteria related validity based on the measurability, consistency and reliability of the metrics. Nominal and summative scales are derived based on the evaluated metric values and are standardized. Future work pertaining to ORDB metrics forms the concluding note.

Keywords: Measurements, Product metrics, Metrics calibration, Object-relational database.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665

2127 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a data cleaning framework with good scalability and versatility. For data with attribute dependency relations, it designs several violation data discovery algorithms, expressed as formal formulas, which can find data inconsistent with conditional attribute dependencies across all target columns, whether the data is structured (SQL) or unstructured (NoSQL). Based on these algorithms, it gives 6 data cleaning methods.
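As an illustration of what discovering violations of an attribute dependency can look like, the following minimal Python sketch flags rows that break a rule of the form "the condition attributes determine the target column". It is not the algorithm from the paper; the function name `find_violations`, the record layout, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict

def find_violations(rows, condition_attrs, target_attr):
    """Return indices of rows whose condition values map to more than one target value.

    Hypothetical helper: rows is a list of dicts, condition_attrs a list of keys.
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        # Group row indices by the values of the condition attributes.
        key = tuple(row[a] for a in condition_attrs)
        groups[key].append(i)

    violations = []
    for indices in groups.values():
        # A group violates the dependency if it holds several distinct target values.
        targets = {rows[i][target_attr] for i in indices}
        if len(targets) > 1:
            violations.extend(indices)
    return sorted(violations)

# Made-up sample data: the rule "zip determines city" is broken by rows 0 and 1.
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "94105", "city": "San Francisco"},
]
print(find_violations(rows, ["zip"], "city"))
```

The sketch only detects conflicting rows; deciding which of the conflicting values is correct is the repair step and is left out here.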

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2612

2126 A Cumulative Learning Approach to Data Mining Employing Censored Production Rules (CPRs)

Authors: Rekha Kandwal, Kamal K. Bharadwaj

Abstract:

Knowledge is indispensable, but voluminous knowledge becomes a bottleneck for efficient processing. A great challenge for data mining is the large number of potential rules generated by the mining process. In fact, the result size is sometimes comparable to the original data. Traditional data mining pruning measures such as support do not sufficiently reduce the huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, making the knowledge more voluminous with each change. The predominant representation of discovered knowledge is the standard Production Rule (PR) of the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs) as an extension of production rules that exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form If P Then D Unless C, where C (the censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. With a rule of this type, the exception condition can be ignored when the resources needed to establish its presence are tight or when there is simply no information available as to whether it holds. Thus the 'If P Then D' part of the CPR expresses the important information, while the 'Unless C' part acts only as a switch that changes the polarity of D to ~D. In this paper, a scheme based on a Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from already discovered flat PRs. Discovering CPRs from flat rules would considerably reduce the number of discovered rules. The proposed scheme incrementally incorporates new knowledge and also considerably reduces the size of the knowledge base with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.
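The behaviour of a single rule of the form If P Then D Unless C can be sketched in a few lines of Python. This is only an illustration of the rule semantics described in the abstract, not the DST-based discovery scheme; the function name `apply_cpr` and the use of `None` for "censor not checked" are assumptions.

```python
def apply_cpr(p, censor=None):
    """Evaluate 'If P Then D Unless C' (hypothetical helper).

    Returns True for D, False for ~D, and None when P does not hold.
    Pass censor=None when C is unknown or too costly to check.
    """
    if not p:
        return None        # premise fails, so the rule does not apply
    if censor is None:
        return True        # censor ignored: fall back to 'If P Then D'
    return not censor      # a holding censor flips D to ~D

# Illustrative reading: P = "is a bird", D = "flies", C = "is a penguin".
print(apply_cpr(True))                 # censor not checked
print(apply_cpr(True, censor=True))    # censor holds
print(apply_cpr(True, censor=False))   # censor checked and absent
```

Treating an unchecked censor as absent mirrors the abstract's point that the exception can be ignored when resources are tight or no information is available.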

Keywords: Censored production rules, cumulative learning, data mining, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1485

2125 Mapping Knowledge Model Onto Java Codes

Authors: B. A. Gobin, R. K. Subramanian

Abstract:

This paper gives an overview of the mapping mechanism of SEAM, a methodology for the automatic generation of knowledge models and their mapping onto Java code. It discusses the rules used to map the different components of the knowledge model automatically onto Java classes, properties and methods. The aim of this mechanism is to help create a prototype that can be used to validate the automatically generated knowledge model. It will also help link the modeling phase with the implementation phase, as existing knowledge engineering methodologies do not provide proper guidelines for the transition from the knowledge modeling phase to the development phase. This will decrease the overheads associated with the development of Knowledge Based Systems.

Keywords: KBS, OWL, ontology, knowledge models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1384

2124 Text Mining Technique for Data Mining Application

Authors: M. Govindarajan

Abstract:

Applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), text data mining, or text mining. The decision tree approach is most useful in classification problems. With this technique, a tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that adds rulesets, cross-validation and boosting to the original C5.0 in order to reduce the error ratio. The feasibility and benefits of the proposed approach are demonstrated on a medical data set, hypothyroid. It is shown that the performance of a classifier on the training cases from which it was constructed gives a poor estimate of its accuracy. A better estimate is obtained by sampling or by using a separate test file; either way, the classifier is evaluated on cases that were not used to build it, and the estimate is reliable only when the numbers of cases used to build and to evaluate the classifier are both large. If the cases in hypothyroid.data and hypothyroid.test were shuffled and divided into a new 2772-case training set and a 1000-case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of See5 is its ability to generate classifiers called rulesets. The ruleset has an error rate of 0.5% on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive accuracy is f-fold cross-validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.
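The cross-validation estimate described above (total errors on the hold-out cases divided by the total number of cases) can be sketched in plain Python. This is a minimal illustration and does not use See5 or C5.0; the function names, the interleaved fold assignment, and the trivial majority-class learner are assumptions chosen to keep the example self-contained.

```python
from collections import Counter

def cross_validation_error(cases, labels, f, train):
    """Estimate the error rate by f-fold cross-validation (hypothetical helper).

    train(cases, labels) must return a function mapping a case to a predicted label.
    """
    n = len(cases)
    errors = 0
    for fold in range(f):
        # Case i is held out in fold i % f; all other cases are used for training.
        test_idx = [i for i in range(n) if i % f == fold]
        train_idx = [i for i in range(n) if i % f != fold]
        model = train([cases[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        errors += sum(1 for i in test_idx if model(cases[i]) != labels[i])
    # Ratio of total hold-out errors to the total number of cases.
    return errors / n

def train_majority(cases, labels):
    """Stand-in learner: always predicts the most common training label."""
    majority = Counter(labels).most_common(1)[0][0]
    return lambda case: majority

# Made-up data: 8 negative and 2 positive cases, 5 folds.
cases = list(range(10))
labels = ["neg"] * 8 + ["pos"] * 2
print(cross_validation_error(cases, labels, 5, train_majority))
```

Every case is held out exactly once, so each contributes to the estimate without ever being used to build the classifier that is tested on it.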

Keywords: C5.0, Error Ratio, text mining, training data, test data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2489

2123 Complexity Leadership and Knowledge Management in Higher Education

Authors: Prabhakar Venugopal Gantasala

Abstract:

Complex environments triggered by globalization have necessitated a new paradigm of leadership, Complexity Leadership, which encompasses the multiple roles that leaders need to take on. The success of higher education institutions depends on how well leaders can provide adaptive, administrative and enabling leadership. Complexity Leadership seems all the more relevant for institutions that are knowledge-driven and thrive on knowledge creation, knowledge storage and retrieval, knowledge sharing and knowledge application. This paper discusses the elements of globalization and the opportunities and challenges it brings. The Complexity Leadership paradigm in a knowledge-based economy, and the need for such a paradigm shift in higher education institutions, is presented. The paper also discusses the support that leaders require in a knowledge-driven economy through knowledge management initiatives.

Keywords: Globalization, Complexity Leadership, Knowledge Management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1796