Search results for: Biometric databases
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 341

281 How Efficiency of Password Attack Based on a Keyboard

Authors: Hsien-cheng Chou, Fei-pei Lai, Hung-chang Lee

Abstract:

At present, the dictionary attack is the basic tool for recovering passwords. To defeat dictionary attacks, users deliberately choose other character strings as passwords. According to statistics, about 14% of users choose patterns of keys on a keyboard (Kkeys, for short) as passwords. This paper develops a framework for attacking passwords chosen from Kkeys and analyzes its efficiency. Within this system, we build keyboard rules from the adjacent and parallel relationships among Kkeys and then use these Kkey rules to generate password databases by depth-first search. The experimental results show that the key space of the databases derived from these Kkey rules can be far smaller than that of the password databases generated for a brute-force attack, which effectively narrows the scope of the attack. Taking one general Kkey rule as an example, namely the combinations of all printable characters (94 types) with Kkey adjacent and parallel relationships, the derived key space is about 240 smaller than that of a brute-force attack. In addition, we demonstrate the method's practicality and value by successfully cracking UNIX and PC access passwords using the password databases created.
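
As a rough illustration of generating candidates by depth-first search over key adjacency, the following minimal sketch enumerates "keyboard walks" in which each key is adjacent to the previous one. The adjacency map covers only six keys of a QWERTY layout and is an assumption made for illustration; it is not the rule set used in the paper, and the parallel relationship is not modelled.

```python
# Hypothetical adjacency map for a small corner of a QWERTY keyboard.
# Each key maps to the keys treated as its neighbours (an assumption).
ADJACENT = {
    "q": "wa",
    "w": "qeas",
    "e": "wsd",
    "a": "qws",
    "s": "awed",
    "d": "se",
}


def keyboard_walks(length):
    """Yield every string of `length` keys whose consecutive keys are adjacent."""

    def dfs(prefix):
        if len(prefix) == length:
            yield prefix
            return
        # Extend the walk depth-first with each neighbour of the last key.
        for nxt in ADJACENT[prefix[-1]]:
            yield from dfs(prefix + nxt)

    for start in ADJACENT:
        yield from dfs(start)


walks = list(keyboard_walks(4))
brute_force = len(ADJACENT) ** 4  # all 4-character strings over the same keys
print(len(walks), "keyboard walks versus", brute_force, "brute-force strings")
```

Even on this toy layout the walk-based candidate set is a small fraction of the brute-force space, which is the effect the abstract describes at full scale.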

Keywords: Brute-force attack, dictionary attack, depth-first search, password attack.

Downloads: 3439

280 Performance Evaluation of Iris Region Detection and Localization for Biometric Identification System

Authors: Chit Su Htwe, Win Htay

Abstract:

Iris recognition technology is the most accurate, fastest and least invasive of the biometric techniques, compared with those using, for example, fingerprints, face, retina, hand geometry, voice or signature patterns. The system developed in this study has the potential to play a key role in high-security areas and can give organizations a fast and secure way to grant access to such areas to authorized personnel only. The aim of the paper is to perform iris region detection and localization of the inner and outer iris boundaries. The system was implemented on the Windows platform using the Visual C# programming language. It is an easy and efficient tool for image processing that achieves high accuracy. In particular, the system includes two main parts. The first preprocesses the iris images using Canny edge detection, segments the iris region from the rest of the image and determines the location of the iris boundaries by applying the Hough transform. The proposed system was tested on 756 iris images from 60 eyes of the CASIA iris database.

Keywords: Canny, C#, Hough transform, image preprocessing.

Downloads: 2044

279 BasWilCalc – Basket Willow (Salix viminalis) Biomass Yield Calculator

Authors: Wiesław Szulczewski, Wojciech Jakubowski, Andrzej Żyromski, Małgorzata Biniak-Pieróg

Abstract:

The aim of the paper was to develop a novel calculator, BasWilCalc, that estimates the actual amount of biomass on basket willow plantations. The proposed method is based on the results of a field experiment conducted during 2011-2013 on a basket willow plantation in south-western Poland. The input data were the results of destructive measurements of the diameter, length and weight of willow stems and of non-destructive biometric measurements of mid-stem diameter and stem length, performed at weekly intervals during the growing season. The analysis made it possible to develop an algorithm which, because energy plantations have a known and constant planting structure, estimates the actual amount of basket willow biomass on the plantation with a given probability and an accuracy specified by the model, based on the number of stems measured and the age of the plantation.

Keywords: Basket willow (Salix viminalis) biomass, biometric measurements, yield, biomass calculator.

Downloads: 1625

278 A Novel Framework for User-Friendly Ontology-Mediated Access to Relational Databases

Authors: Efthymios Chondrogiannis, Vassiliki Andronikou, Efstathios Karanastasis, Theodora Varvarigou

Abstract:

Large amounts of data are typically stored in relational databases (DB), which can efficiently handle user queries intended to elicit the appropriate information from data sources. However, direct access to and use of this data requires end users to have an adequate technical background and to cope with the internal data structure and the values presented. Consequently, information retrieval is quite a difficult process even for IT or DB experts, given the limited contributions of relational databases from the conceptual point of view. Ontologies enable users to formally describe a domain of knowledge in terms of concepts and the relations among them, and hence they can be used to specify unambiguously the information captured by the relational database. However, accessing information residing in a database using ontologies is feasible only if the users are keen on using semantic web technologies. To enable users from different disciplines to retrieve the appropriate data, a Graphical User Interface is necessary. In this work, we present an interactive, ontology-based, semantically enabled web tool that can be used for information retrieval purposes. The tool is based entirely on the ontological representation of the underlying database schema and provides a user-friendly environment through which users can graphically form and execute their queries.

Keywords: Ontologies, Relational Databases, SPARQL, Web Interface.

Downloads: 1884

277 Determination of Adequate Fuzzy Inequalities for their Usage in Fuzzy Query Languages

Authors: Marcel Shirvanian, Wolfram Lippe

Abstract:

Although the usefulness of fuzzy databases has been pointed out in several works, they are not fully developed in numerous domains. A task that is mostly disregarded and which is the topic of this paper is the determination of suitable inequalities for fuzzy sets in fuzzy query languages. This paper examines which kinds of fuzzy inequalities exist at all. Afterwards, different procedures are presented that appear theoretically appropriate. By being applied to various examples, their strengths and weaknesses are revealed. Furthermore, an algorithm for an efficient computation of the selected fuzzy inequality is shown.

Keywords: Fuzzy Databases, Fuzzy Inequalities, Fuzzy Query Languages, Fuzzy Ranking.

Downloads: 1303

276 A Neuron Model of Facial Recognition and Detection of an Authorized Entity Using Machine Learning System

Authors: J. K. Adedeji, M. O. Oyekanmi

Abstract:

This paper critically examines the use of machine learning procedures in curbing unauthorized access to valuable areas of an organization. Passwords, PIN codes and user identification have in recent times been only partially successful in curbing identity-related crime, hence the need for a system that incorporates biometric characteristics such as DNA and pattern recognition of variations in facial expressions. The facial model uses the OpenCV library, which relies on certain physiological features; a Raspberry Pi 3 module is used to compile the OpenCV library, which extracts the detected faces and stores them in the datasets directory through the use of a camera. The model is trained with 50 epoch runs on the database, and faces are recognized by the Local Binary Pattern Histogram (LBPH) recognizer contained in OpenCV. The training algorithm used by the neural network is backpropagation, coded in Python, with 200 epoch runs to identify specific resemblance in the exclusive OR (XOR) output neurons. The research confirmed that physiological parameters are more effective measures for curbing identity-related crime.
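
To make the Local Binary Pattern Histogram idea concrete, here is a minimal pure-Python sketch of the basic 3x3 LBP operator and the histogram built from it. It is an illustration under simplifying assumptions (fixed 3x3 neighbourhood, a single histogram for the whole image) and is not OpenCV's implementation; the function names are hypothetical.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 grayscale patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting from the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for row, col in offsets:
        # A neighbour contributes a 1 bit when it is at least as bright as the centre.
        code = (code << 1) | (1 if patch[row][col] >= center else 0)
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over every interior pixel of a 2D image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

A recognizer of this kind compares the histogram of a probe face with stored histograms, for example by a distance between the two vectors.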

Keywords: Biometric characters, facial recognition, neural network, OpenCV.

Downloads: 648

275 Student Records Management System Using Smart Cards and Biometric Technology for Educational Institutions

Authors: Patrick O. Bobbie, Prince S. Attrams

Abstract:

In recent times, rapid change in technology has transformed the way records are handled in educational institutions. There is also a need for reliable access to these records and for ease of use, resulting in increased productivity in organizations. In academic institutions, such benefits help in quality assessments, institutional performance, and assessments of teaching and evaluation methods. Students in educational institutions benefit the most when advanced technologies are deployed in accessing records. This research paper discusses the use of biometric technologies coupled with smartcard technologies to provide a unique way of identifying students and matching their data to financial records to grant them access to restricted areas such as examination halls. The system developed in this paper has an identity verification component as one of its main functionalities. A systematic software development cycle of analysis, design, coding, testing and support was used. The system provides a secure way of verifying a student's identity and real-time verification of financial records. An advanced prototype version of the system has been developed for testing purposes.

Keywords: Biometrics, fingerprints, identity-verification, smartcards.

Downloads: 2004

274 Using Automated Database Reverse Engineering for Database Integration

Authors: M. R. Abbasifard, M. Rahgozar, A. Bayati, P. Pournemati

Abstract:

One important problem in today's organizations is the existence of non-integrated information systems, with inconsistency and a lack of suitable correlations between legacy and modern systems. One main solution is to transfer the local databases into a global one. To do so, we need to extract the data structures from the legacy systems and integrate them with the new technology systems. In legacy systems, huge amounts of data are stored in legacy databases. They require particular attention since they need more effort to be normalized, reformatted and moved to modern database environments. Designing the new integrated (global) database architecture and applying reverse engineering requires data normalization. This paper proposes the use of database reverse engineering to integrate legacy and modern databases in organizations. The suggested approach consists of methods and techniques for generating the data transformation rules needed for data structure normalization.

Keywords: Reverse Engineering, Database Integration, System Integration, Data Structure Normalization

Downloads: 1804

273 Fingerprint Identification using Discretization Technique

Authors: W. Y. Leng, S. M. Shamsuddin

Abstract:

The fingerprint-based identification system is one of the well-known biometric systems in the area of pattern recognition and has long been under study because of its important role in forensic science, where it can help the government criminal justice community. In this paper, we propose a framework for identifying individuals by means of fingerprints. Unlike most conventional fingerprint identification frameworks, the extracted geometrical element features (GEFs) go through a Discretization process. The intention of Discretization in this study is to obtain unique individual features that reflect individual variance in order to discriminate one person from another. Previously, Discretization has been shown to be particularly efficient for identification on English handwriting, with an accuracy of 99.9%, and for discrimination of twins' handwriting, with an accuracy of 98%. Because of its high discriminative power, this method is adopted into this framework as an independent method to assess the accuracy of fingerprint identification. The experimental results show that the identification accuracy of the proposed system using Discretization is 100% for FVC2000, 93% for FVC2002 and 89.7% for FVC2004, which is much better than the conventional or existing fingerprint identification system (72% for FVC2000, 26% for FVC2002 and 32.8% for FVC2004). The results indicate that the Discretization approach boosts classification effectively and therefore proves to be suitable for other biometric features besides handwriting and fingerprints.

Keywords: Discretization, fingerprint identification, geometrical features, pattern recognition

Downloads: 2310

272 Natural Language Database Interface for Selection of Data Using Grammar and Parsing

Authors: N. D. Karande, G. A. Patil

Abstract:

Databases have become ubiquitous. Almost all IT applications store information in and retrieve information from databases. Retrieving information from a database requires knowledge of technical languages such as Structured Query Language (SQL). However, the majority of users who interact with databases do not have a technical background and are intimidated by the idea of using languages such as SQL. This has led to the development of a few Natural Language Database Interfaces (NLDBIs). An NLDBI allows the user to query the database in a natural language. This paper presents the architecture of a new NLDBI system and its implementation, and discusses the results obtained. In most typical NLDBI systems, the natural language statement is converted into an internal representation based on the syntactic and semantic knowledge of the natural language. This representation is then converted into queries using a representation converter. A natural language query is translated into an equivalent SQL query after processing through various stages. The work has been tested on primitive database queries with certain constraints.

Keywords: Natural language database interface, representation converter, syntactic and semantic knowledge

Downloads: 2654

271 A Survey on Life Science Database Citation Frequency in Scientific Literatures

Authors: Hendry Muljadi, Jiro Araki, Satoru Miyazaki, Asao Fujiyama

Abstract:

Many databases covering various fields of the life sciences are available online. To find well-used databases, we conducted a survey measuring how frequently life science databases are cited in the scientific literature. The survey counts how many articles available in the PubMed Central archive cite a specific life science database. This paper presents and discusses the results of the survey.

Keywords: Life science, database, metadatabase, PubMed Central.

Downloads: 1379

270 A Formal Suite of Object Relational Database Metrics

Authors: Justus S, K Iyakutti

Abstract:

Object Relational Databases (ORDB) are more complex in nature than traditional relational databases because they combine object-oriented concepts with the relational features of conventional databases. Designing an ORDB demands an efficient, high-quality schema that takes structural, functional and componential traits into account. The internal quality of the schema is assured by metrics that measure the relevant attributes. This is extended to substantiate the understandability, usability and reliability of the schema, thus assuring its external quality. This work formalizes ORDB metrics: metric definition, evaluation methodology and calibration of the metric. Three ORDB schemas were used to conduct the evaluation and the formalization of the metrics. The metrics are calibrated using content- and criterion-related validity based on the measurability, consistency and reliability of the metrics. Nominal and summative scales are derived from the evaluated metric values and are standardized. The paper concludes with future work on ORDB metrics.

Keywords: Measurements, Product metrics, Metrics calibration, Object-relational database.

Downloads: 1620

269 Density Clustering Based On Radius of Data (DCBRD)

Authors: A.M. Fahim, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Clustering algorithms are attractive for the task of class identification in spatial databases. However, application to large spatial databases raises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape, and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, a density-based clustering algorithm (DCBRD) is presented that relies on knowledge acquired from the data by dividing the data space into overlapping regions. The proposed algorithm discovers arbitrarily shaped clusters, requires no input parameters and uses the same definitions as the DBSCAN algorithm. We performed an experimental evaluation of its effectiveness and efficiency and compared the results with those of DBSCAN. The results of our experiments demonstrate that the proposed algorithm is significantly efficient in discovering clusters of arbitrary shape and size.
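
For readers unfamiliar with the DBSCAN definitions the abstract refers to (eps-neighbourhood, core point, density-reachability), the following minimal sketch implements plain DBSCAN in pure Python. It is the baseline algorithm, not DCBRD: unlike the proposed method it still needs the two input parameters `eps` and `min_pts`, and the brute-force neighbour search is chosen for brevity rather than efficiency.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # eps-neighbourhood of point i, including i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nb)
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```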

Keywords: Clustering Algorithms, Arbitrary Shape of clusters, cluster Analysis.

Downloads: 1826

268 Fuzzy Processing of Uncertain Data

Authors: Petr Morávek, Miloš Šeda

Abstract:

In practice, we often come across situations where decisions must be made based on incomplete or uncertain data. In control systems this may be because the exact mathematical model is unknown, or because it is so complex (e.g. nonlinear) that it must be simplified or solved using a rule base. In databases, when searching data we compute a similarity measure between the requirements of the selection and the stored data, where both the select query and the data itself may contain vague terms, for example in the form of linguistic qualifiers. In this paper, we focus on the processing of uncertain data in databases and demonstrate it on an example of multi-criteria decision making in the selection of variants specified by a larger number of technical parameters.
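
A minimal sketch of how a vague query term can be matched against stored values: each linguistic term is modelled as a triangular fuzzy set, and a variant's overall score is the minimum of its membership degrees across criteria. The terms, parameter ranges and variants below are invented for illustration, and the min operator is only one common choice of aggregation; none of this is taken from the paper.

```python
def triangular(x, low, peak, high):
    """Membership degree of x in a triangular fuzzy set (low < peak < high)."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


# Hypothetical linguistic terms for two technical parameters.
def moderate_price(price):
    return triangular(price, 0, 100, 300)


def medium_power(kw):
    return triangular(kw, 50, 150, 250)


# Hypothetical variants: name -> (price, power in kW).
variants = {"A": (120, 140), "B": (250, 150), "C": (90, 60)}


def score(name):
    price, kw = variants[name]
    # Fuzzy AND of both criteria, using the minimum as the conjunction.
    return min(moderate_price(price), medium_power(kw))


ranking = sorted(variants, key=score, reverse=True)
print(ranking)
```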

Keywords: fuzzy logic, linguistic variable, multicriteria decision

Downloads: 1372

267 ECG Based Reliable User Identification Using Deep Learning

Authors: R. N. Begum, Ambalika Sharma, G. K. Singh

Abstract:

Identity theft has serious ramifications beyond data and personal information loss. This necessitates the implementation of robust and efficient user identification systems. Therefore, automatic biometric recognition systems are the need of the hour, and electrocardiogram (ECG)-based systems are unquestionably the best choice due to their appealing inherent characteristics. The Convolutional Neural Networks (CNNs) are the recent state-of-the-art techniques for ECG-based user identification systems. However, the results obtained are significantly below standards, and the situation worsens as the number of users and types of heartbeats in the dataset grows. As a result, this study proposes a highly accurate and resilient ECG-based person identification system using CNN's dense learning framework. The proposed research explores explicitly the caliber of dense CNNs in the field of ECG-based human recognition. The study tests four different configurations of dense CNN which are trained on a dataset of recordings collected from eight popular ECG databases. With the highest False Acceptance Rate (FAR)  of 0.04% and the highest False Rejection Rate (FRR)  of 5%, the best performing network achieved an identification accuracy of 99.94%. The best network is also tested with various train/test split ratios. The findings show that DenseNets are not only extremely reliable, but also highly efficient. Thus, they might also be implemented in real-time ECG-based human recognition systems.
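
The two error rates quoted above have simple definitions: FAR is the fraction of impostor attempts that are wrongly accepted, and FRR is the fraction of genuine attempts that are wrongly rejected. The sketch below computes both from match scores at a given threshold; the scores and the threshold are made up for illustration and have no connection to the paper's data.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when scores at or above `threshold` are accepted."""
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


# Hypothetical similarity scores in [0, 1].
genuine = [0.9, 0.8, 0.4, 0.95]
impostor = [0.1, 0.3, 0.85, 0.2]
far, frr = far_frr(genuine, impostor, threshold=0.5)
print(f"FAR={far:.2%} FRR={frr:.2%}")
```

Raising the threshold lowers FAR at the cost of a higher FRR, which is why the two figures are normally reported together.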

Keywords: Biometrics, dense networks, identification rate, train/test split ratio.

Downloads: 463

266 Query Optimization Techniques for XML Databases

Authors: Su Cheng Haw, G. S. V. Radha Krishna Rao

Abstract:

Over the past few years, XML (eXtensible Mark-up Language) has emerged as the standard for information representation and data exchange over the Internet. This paper provides a starting point for new researchers venturing into the field of XML databases. We survey storage representations for XML documents and review XML query processing and optimization techniques with respect to the particular storage instance. Various optimization technologies have been developed to solve the query retrieval and updating problems. In later years, most researchers have proposed hybrid optimization techniques. A hybrid system opens the possibility of covering each technology's weaknesses with the strengths of another. This paper reviews the advantages and limitations of optimization techniques.

Keywords: indexing, labeling scheme, query optimization, XML storage.

Downloads: 1993

265 Context-Aware Querying in Multimedia Databases – A Futuristic Approach

Authors: Nadeem Iftikhar, Zouhaib Zafar, Shaukat Ali

Abstract:

Efficient retrieval of multimedia objects has gained enormous focus in recent years. A number of techniques have been suggested for retrieval of textual information; however, relatively little has been suggested for efficient retrieval of multimedia objects. In this paper we propose a generic architecture for context-aware retrieval of multimedia objects. The proposed framework combines the well-known approaches of text-based retrieval and context-aware retrieval to formulate an architecture for accurate retrieval of multimedia data.

Keywords: Context-aware retrieval, information retrieval, multimedia databases, multimedia data.

Downloads: 1494

264 A Comparative Study of Main Memory Databases and Disk-Resident Databases

Authors: F. Raja, M.Rahgozar, N. Razavi, M. Siadaty

Abstract:

Main Memory Database systems (MMDB) store their data in main physical memory and provide very high-speed access. Conventional database systems are optimized for the particular characteristics of disk storage mechanisms. Memory resident systems, on the other hand, use different optimizations to structure and organize data, as well as to make it reliable. This paper provides a brief overview on MMDBs and one of the memory resident systems named FastDB and compares the processing time of this system with a typical disc resident database based on the results of the implementation of TPC benchmarks environment on both.

Keywords: Disk-Resident Database, FastDB, Main Memory Database.

Downloads: 2994

263 A Framework for SQL Learning: Linking Learning Taxonomy, Cognitive Model and Cross Cutting Factors

Authors: Huda Al Shuaily, Karen Renaud

Abstract:

Databases comprise the foundation of most software systems. System developers inevitably write code to query these databases. The de facto language for querying is SQL and this, consequently, is the default language taught by higher education institutions. There is evidence that learners find it hard to master SQL, harder than mastering other programming languages such as Java. Educators do not agree about explanations for this seeming anomaly. Further investigation may well reveal the reasons. In this paper, we report on our investigations into how novices learn SQL, the actual problems they experience when writing SQL, as well as the differences between expert and novice SQL query writers. We conclude by presenting a model of SQL learning that should inform the instructional material design process better to support the SQL learning process.

Keywords: Pattern, SQL, learning, model.

Downloads: 1288

262 Actionable Rules: Issues and New Directions

Authors: Harleen Kaur

Abstract:

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden and interesting patterns from the huge amounts of data stored in databases. Data mining is a stage of the KDD process that aims at selecting and applying a particular data mining algorithm to extract interesting and useful knowledge. Data mining methods are expected to find patterns in databases that are interesting according to some measures. It is of vital importance to define good measures of interestingness that allow the system to discover only the useful patterns. Measures of interestingness are divided into objective and subjective measures. Objective measures depend only on the structure of a pattern and can be quantified using statistical methods, while subjective measures depend only on the subjectivity and understanding of the user who examines the patterns. Subjective measures are further divided into actionable, unexpected and novel. A key issue facing the data mining community is how to take action on the basis of discovered knowledge. For a pattern to be actionable, the user's subjectivity is captured through his or her background knowledge about the domain. Here, we consider the actionability of the discovered knowledge as a measure of interestingness and raise important issues which need to be addressed to discover actionable knowledge.

Keywords: Data Mining Community, Knowledge Discovery in Databases (KDD), Interestingness, Subjective Measures, Actionability.

Downloads: 1902

261 A New Approach for Recoverable Timestamp Ordering Schedule

Authors: Hassan M. Najadat

Abstract:

A new approach to the timestamp ordering problem in serializable schedules is presented. Since the number of database users is increasing rapidly, accuracy and high throughput are main topics in the database area. Strict 2PL does not allow all possible serializable schedules and so does not yield high throughput. The main advantages of the approach are the ability to enforce recoverable execution of transactions and the high achievable performance of concurrent execution in central databases. Compared to Strict 2PL, the general structure of the algorithm is simple, deadlock-free, and allows executing all possible serializable schedules, which results in high throughput. Various examples which include different orders of database operations are discussed.
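
As background for the abstract above, this minimal sketch shows the textbook basic timestamp ordering rules: a read is rejected if a younger transaction has already written the item, and a write is rejected if a younger transaction has already read or written it. It is not the recoverable scheme proposed in the paper, and the class and method names are hypothetical.

```python
class TimestampOrdering:
    """Basic timestamp ordering checks for single-version data items."""

    def __init__(self):
        self.read_ts = {}   # item -> largest timestamp that has read it
        self.write_ts = {}  # item -> largest timestamp that has written it

    def read(self, ts, item):
        """Return True if the read is allowed, False if the transaction must abort."""
        if ts < self.write_ts.get(item, 0):
            return False  # a younger transaction already overwrote the item
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        """Return True if the write is allowed, False if the transaction must abort."""
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False  # a younger transaction already read or wrote the item
        self.write_ts[item] = ts
        return True


scheduler = TimestampOrdering()
print(scheduler.read(2, "x"))   # transaction with timestamp 2 reads x
print(scheduler.write(1, "x"))  # older transaction 1 then tries to write x
```

Because no transaction ever waits for a lock under these rules, deadlock cannot occur; conflicts are resolved by aborting and restarting the offending transaction.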

Keywords: Concurrency control, schedule, timestamp, transaction.

Downloads: 2047

260 Knowledge Discovery from Production Databases for Hierarchical Process Control

Authors: Pavol Tanuska, Pavel Vazan, Michal Kebisek, Dominika Jurovata

Abstract:

The paper presents the results of a project oriented towards the use of knowledge discovery from production systems for the needs of hierarchical process control. One of the main project goals was to propose a knowledge discovery model for process control. Specific data mining methods and techniques were used for defined problems of process control. The gained knowledge was applied to a real production system, and the proposed solution was thereby verified. The paper documents how the newly discovered knowledge can be applied in real hierarchical process control. The opportunities for applying the proposed knowledge discovery model to hierarchical process control are also specified.

Keywords: Hierarchical process control, knowledge discovery from databases, neural network.

Downloads: 1714

259 JREM: An Approach for Formalising Models in the Requirements Phase with JSON and NoSQL Databases

Authors: Aitana Alonso-Nogueira, Helia Estévez-Fernández, Isaías García

Abstract:

This paper presents an approach to reducing some of the current flaws of the requirements phase of the software development process. It takes the software requirements of an application, builds a conceptual model of them and formalizes it in JSON documents. This formal model is stored in a document-oriented NoSQL database, namely MongoDB, because of its advantages in flexibility and efficiency. In addition, this paper underlines the contributions of the detailed approach and shows some applications and benefits for future work in the field of automatic code generation using model-driven engineering tools.

Keywords: Conceptual modeling, JSON, NoSQL databases, requirements engineering, software development.

Downloads: 1032

258 Generating Frequent Patterns through Intersection between Transactions

Authors: M. Jamali, F. Taghiyareh

Abstract:

The problem of frequent itemset mining is considered in this paper. A new technique is proposed to generate frequent patterns in large databases without time-consuming candidate generation. This technique focuses on transactions instead of itemsets. The algorithm takes the intersection between one transaction and the other transactions and computes the maximum shared items between transactions, instead of creating itemsets and computing their frequency. Applied to real-life transaction data, the technique achieves significant efficiency in mining association rules from databases.
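
One way to read the intersection idea is sketched below: instead of growing candidate itemsets level by level, intersect pairs of transactions to obtain the items they share, then count how many transactions contain each shared set. This is an illustrative interpretation only, not the algorithm from the paper; the function name and the toy transactions are assumptions, and pairwise intersection alone does not enumerate every frequent itemset.

```python
from itertools import combinations


def frequent_by_intersection(transactions, min_support):
    """Map each shared itemset to its support count if it meets `min_support`."""
    sets = [frozenset(t) for t in transactions]
    # Shared items of every pair of transactions (empty intersections dropped).
    candidates = {a & b for a, b in combinations(sets, 2) if a & b}
    result = {}
    for itemset in candidates:
        support = sum(itemset <= t for t in sets)
        if support >= min_support:
            result[itemset] = support
    return result


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "eggs"},
]
print(frequent_by_intersection(baskets, min_support=3))
```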

Keywords: Association rules, data mining, frequent patterns, shared itemset.

Downloads: 1357

257 Automated Knowledge Engineering

Authors: Sandeep Chandana, Rene V. Mayorga, Christine W. Chan

Abstract:

This article outlines the conceptualization and implementation of an intelligent system capable of extracting knowledge from databases. The use of hybridized features of both Rough and Fuzzy Set theory gives the developed system flexibility in dealing with discrete as well as continuous datasets. A raw data set provided to the system is first transformed into a computer-readable format and then pruned. The refined data set is then processed through various Rough Set operators, which enable the discovery of parameter relationships and interdependencies. The discovered knowledge is automatically transformed into a rule base expressed in fuzzy terms. Two exemplary cancer repository datasets (for breast and lung cancer) have been used to test and implement the proposed framework.
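
The basic Rough Set operators are the lower and upper approximations of a target set: objects that are indistinguishable on the chosen attributes form blocks, the lower approximation collects blocks lying entirely inside the target, and the upper approximation collects blocks that overlap it. The sketch below is a generic illustration with invented records and attribute names; it does not reproduce the system or the datasets described in the article.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of `target` under `attributes`."""
    # Group objects that share the same values on the chosen attributes.
    classes = defaultdict(set)
    for name, values in objects.items():
        classes[tuple(values[a] for a in attributes)].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:
            lower |= block  # block certainly belongs to the target
        if block & target:
            upper |= block  # block possibly belongs to the target
    return lower, upper


# Hypothetical records described by two discrete attributes.
records = {
    "r1": {"size": "small", "shape": "round"},
    "r2": {"size": "small", "shape": "round"},
    "r3": {"size": "large", "shape": "round"},
    "r4": {"size": "large", "shape": "irregular"},
}
lower, upper = approximations(records, ["size", "shape"], target={"r2", "r3"})
print(sorted(lower), sorted(upper))
```

Objects in the upper but not the lower approximation form the boundary region, where the available attributes cannot decide membership.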

Keywords: Knowledge Extraction, Fuzzy Sets, Rough Sets, Neuro–Fuzzy Systems, Databases

Downloads: 1749

256 Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Authors: Eiad Yafi, M. A. Alam, Ranjit Biswas

Abstract:

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown but useful and significant information from massive volumes of data. Data mining is the stage of the KDD process that applies an algorithm to extract interesting patterns. Such algorithms usually generate a huge volume of patterns, which have to be evaluated with interestingness measures to reflect the user's requirements. Interestingness measures fall into two classes: (i) objective measures and (ii) subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user's perspective. In this report, we briefly review the most widely used and successful subjective measures and propose a new subjective measure of interestingness, namely shocking.

Keywords: Shocking rules (SHR).
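
The two objective measures named in the abstract above, support and confidence, have standard definitions that can be sketched as follows. The function names and the sample transactions are hypothetical; the proposed "shocking" measure is not defined in the abstract and is therefore not shown.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Defined as support(antecedent and consequent) / support(antecedent);
    returns 0.0 when the antecedent never occurs.
    """
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical transactions, for illustration only.
data = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
print(support(data, {"a", "b"}))
print(confidence(data, {"a"}, {"b"}))
```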

Downloads 1495
255 A Frame Work for Query Results Refinement in Multimedia Databases

Authors: Humaira Liaquat, Nadeem Iftikhar, Shaukat Ali, Zohaib Zafar Iqbal

Abstract:

In the current age, retrieving relevant information from massive amounts of data is a challenging job. Over the years, precise and relevant retrieval of information has attained high significance. There is a growing need in the market to build systems that can retrieve multimedia information that precisely meets the user's current needs. In this paper, we introduce a framework for refining query results before showing them to the user, using ambient intelligence, user profile, group profile, user location, time, day, user device type and extracted features. A prototype tool was also developed to demonstrate the efficiency of the proposed approach.

Keywords: Context aware retrieval, Information retrieval, Ambient Intelligence, Multimedia databases, User and group profile.
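
One way to picture the refinement step described in the abstract above is a filter followed by a re-ranking. The sketch below is purely illustrative and makes several assumptions: the `Context` fields, the result dictionary keys, and the 0.1 bonus per matching tag are all hypothetical, and only two of the listed context signals (user profile and device type) are modeled.

```python
from dataclasses import dataclass


@dataclass
class Context:
    interests: set        # assumed to come from the user or group profile
    device_formats: set   # media formats the user's device can display


def refine(results, ctx, bonus=0.1):
    """Drop results the device cannot show, then re-rank by profile match.

    Each result is a dict with 'title', 'format', 'tags' and a base 'score'.
    """
    playable = [r for r in results if r["format"] in ctx.device_formats]
    return sorted(
        playable,
        key=lambda r: r["score"] + bonus * len(set(r["tags"]) & ctx.interests),
        reverse=True,
    )


# Hypothetical query results, for illustration only.
hits = [
    {"title": "A", "format": "mp4", "tags": ["sports"], "score": 0.50},
    {"title": "B", "format": "flv", "tags": ["news"], "score": 0.90},
    {"title": "C", "format": "mp4", "tags": ["news", "tech"], "score": 0.45},
]
ctx = Context(interests={"news", "tech"}, device_formats={"mp4", "jpg"})
print([r["title"] for r in refine(hits, ctx)])
```

Result B is removed because the device cannot display its format, and C overtakes A because two of its tags match the profile.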

Downloads 1500
254 A Graph-Based Approach for Placement of No-Replicated Databases in Grid

Authors: Cherif Haddad, Faouzi Ben Charrada

Abstract:

In a wide-area environment such as a Grid, data placement is an important aspect of distributed database systems. In this paper, we address the problem of the initial placement of non-replicated database fragments in a Grid architecture. We propose a graph-based approach that considers resource restrictions. The goal is to optimize the use of computing, storage and communication resources. The proposed approach is developed in two phases: in the first phase, we group fragments using knowledge about fragment dependencies and, in the second phase, we determine an efficient placement of the fragment groups on the Grid. We also show, via experimental analysis, that our approach gives solutions that are close to optimal for different databases and Grid configurations.

Keywords: Grid computing, Distributed systems, Data resources management, Database systems, Database placement.
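
The two-phase structure described in the abstract above (group dependent fragments, then place the groups) can be sketched as follows. This is not the authors' method: grouping by connected components and greedy placement on the node with the most free storage are assumptions made for illustration, storage is the only resource modeled, and all names and numbers are hypothetical.

```python
def group_fragments(fragments, dependencies):
    """Phase 1: put fragments linked by a dependency into the same group."""
    parent = {f: f for f in fragments}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in dependencies:
        parent[find(a)] = find(b)

    groups = {}
    for f in fragments:
        groups.setdefault(find(f), []).append(f)
    return list(groups.values())


def place_groups(groups, sizes, capacity):
    """Phase 2: greedily place each group, largest first, on the node
    that currently has the most free storage."""
    free = dict(capacity)
    placement = {}
    for group in sorted(groups, key=lambda g: -sum(sizes[f] for f in g)):
        need = sum(sizes[f] for f in group)
        node = max(free, key=free.get)
        if free[node] < need:
            raise ValueError(f"no node can hold group {group}")
        free[node] -= need
        for f in group:
            placement[f] = node
    return placement


# Hypothetical fragments and Grid nodes, for illustration only.
frags = ["F1", "F2", "F3", "F4"]
deps = [("F1", "F2"), ("F3", "F4")]
sizes = {"F1": 4, "F2": 3, "F3": 2, "F4": 2}
print(place_groups(group_fragments(frags, deps), sizes, {"n1": 10, "n2": 5}))
```

Keeping dependent fragments on the same node avoids communication between them, which is one plausible reading of why grouping precedes placement.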

Downloads 1591
253 Visualization and Indexing of Spectral Databases

Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi

Abstract:

On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from a spectral database can be used to estimate unmeasured product properties and to monitor the operation of the process. These techniques are based on looking for similar spectra by nearest neighbor algorithms and distance-based searching methods. Search for nearest neighbors in the spectral space is an NP-hard problem; the computational complexity increases with the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time, some kind of indexing could be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirement of estimation algorithms by providing a two-dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of the spectral database not only supports the application of distance- and similarity-based techniques but also enables the use of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means that the prediction need not use the high-dimensional space but can be based on the mapped space instead. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance-based learning algorithms.

Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.
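
To make the cost argument in the abstract above concrete, the sketch below shows an unindexed nearest-neighbor search over spectra. It is only a baseline illustration, not the proposed 2D indexing; the function name `nearest_spectra` and the three-point spectra are hypothetical. Every query computes one Euclidean distance per stored sample, and each distance touches every point of the spectrum.

```python
import math


def nearest_spectra(database, query, k=1):
    """Return the names of the k stored spectra closest to `query`.

    database: {sample_name: list of absorbance values}
    query:    list of absorbance values of the same length
    """
    # One full-length distance computation per stored sample.
    ranked = sorted(database.items(), key=lambda item: math.dist(item[1], query))
    return [name for name, _ in ranked[:k]]


# Hypothetical three-point spectra, for illustration only.
spectra = {
    "s1": [0.10, 0.20, 0.30],
    "s2": [0.90, 0.80, 0.70],
    "s3": [0.15, 0.25, 0.35],
}
print(nearest_spectra(spectra, [0.12, 0.22, 0.32], k=2))
```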

Downloads 1726
252 NOHIS-Tree: High-Dimensional Index Structure for Similarity Search

Authors: Mounira Taileb, Sami Touati

Abstract:

In Content-Based Image Retrieval systems it is important to use an efficient indexing technique in order to perform and accelerate the search in huge databases. The indexing technique used should also support the high dimensionality of image features. In this paper we present the hierarchical index NOHIS-tree (Non Overlapping Hierarchical Index Structure) for scaling up to very large databases. We also present a study of the influence of clustering on search time. The performance test results show that NOHIS-tree performs better than SR-tree. Tests also show that NOHIS-tree maintains its performance in high-dimensional spaces. We also include performance tests that seek to determine the number of clusters in the NOHIS-tree that gives the best search time.

Keywords: High-dimensional indexing, k-nearest neighbors search.
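
The abstract above does not describe the NOHIS-tree's internals, so the sketch below does not reproduce it. It only illustrates the general reason clustering can shorten a nearest-neighbor search: a whole cluster can be skipped when the triangle inequality shows none of its members can beat the best match found so far. The flat (non-hierarchical) layout, the function names, and the sample vectors are assumptions.

```python
import math


def build_clusters(groups):
    """Summarize each group of vectors by its centroid and covering radius."""
    clusters = []
    for vectors in groups:
        dim = len(vectors[0])
        centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
        radius = max(math.dist(centroid, v) for v in vectors)
        clusters.append((centroid, radius, vectors))
    return clusters


def nearest(clusters, query):
    """Return (closest vector, number of clusters actually scanned)."""
    best, best_d, scanned = None, math.inf, 0
    # Visit clusters from the closest centroid to the farthest.
    for centroid, radius, vectors in sorted(
        clusters, key=lambda c: math.dist(c[0], query)
    ):
        # Any member v satisfies dist(query, v) >= dist(query, centroid) - radius.
        if math.dist(centroid, query) - radius >= best_d:
            continue
        scanned += 1
        for v in vectors:
            d = math.dist(v, query)
            if d < best_d:
                best, best_d = v, d
    return best, scanned


# Hypothetical 2D feature vectors in two well-separated groups.
index = build_clusters([
    [[0, 0], [1, 0], [0, 1]],
    [[10, 10], [11, 10], [10, 11]],
])
print(nearest(index, [0.9, 0.1]))
```

In this toy case the second cluster is never scanned. How many clusters to use is a trade-off between pruning power and per-cluster overhead, which is the kind of question the abstract's last sentence refers to.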

Downloads 1400