Search results for: Database systems
4929 Data Migration Methodology from Relational to NoSQL Databases
Authors: Mohamed Hanine, Abdesadik Bendarag, Omar Boutkhoum
Abstract:
Currently, the field of data migration is very topical. As the number of applications developed rapidly, the ever-increasing volume of data collected has driven the architectural migration from Relational Database Management System (RDBMS) to NoSQL (Not Only SQL) database. This very recent technology is important enough in the field of database management. The main aim of this paper is to present a methodology for data migration from RDBMS to NoSQL database. To illustrate this methodology, we implement a software prototype using MySQL as a RDBMS and MongoDB as a NoSQL database. Although this is a hard engineering work, our results show that the proposed methodology can successfully accomplish the goal of this study.Keywords: Data Migration, MySQL, RDBMS, NoSQL, MongoDB.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 43674928 A New Color Image Database for Benchmarking of Automatic Face Detection and Human Skin Segmentation Techniques
Authors: Abdallah S. Abdallah, Mohamad A bou El-Nasr, A. Lynn Abbott
Abstract:
This paper presents a new color face image database for benchmarking of automatic face detection algorithms and human skin segmentation techniques. It is named the VT-AAST image database, and is divided into four parts. Part one is a set of 286 color photographs that include a total of 1027 faces in the original format given by our digital cameras, offering a wide range of difference in orientation, pose, environment, illumination, facial expression and race. Part two contains the same set in a different file format. The third part is a set of corresponding image files that contain human colored skin regions resulting from a manual segmentation procedure. The fourth part of the database has the same regions converted into grayscale. The database is available on-line for noncommercial use. In this paper, descriptions of the database development, organization, format as well as information needed for benchmarking of algorithms are depicted in detail.Keywords: Image database, color image analysis, facedetection, skin segmentation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25884927 Implementing a Database from a Requirement Specification
Abstract:
Creating a database scheme is essentially a manual process. From a requirement specification the information contained within has to be analyzed and reduced into a set of tables, attributes and relationships. This is a time consuming process that has to go through several stages before an acceptable database schema is achieved. The purpose of this paper is to implement a Natural Language Processing (NLP) based tool to produce a relational database from a requirement specification. The Stanford CoreNLP version 3.3.1 and the Java programming were used to implement the proposed model. The outcome of this study indicates that a first draft of a relational database schema can be extracted from a requirement specification by using NLP tools and techniques with minimum user intervention. Therefore this method is a step forward in finding a solution that requires little or no user intervention.
Keywords: Information Extraction, Natural Language Processing, Relation Extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22264926 Effect of Network Communication Overhead on the Performance of Adaptive Speculative Locking Protocol
Authors: Waqar Haque, Pai Qi
Abstract:
The speculative locking (SL) protocol extends the twophase locking (2PL) protocol to allow for parallelism among conflicting transactions. The adaptive speculative locking (ASL) protocol provided further enhancements and outperformed SL protocols under most conditions. Neither of these protocols consider the impact of network latency on the performance of the distributed database systems. We have studied the performance of ASL protocol taking into account the communication overhead. The results indicate that though system load can counter network latency, it can still become a bottleneck in many situations. The impact of latency on performance depends on many factors including the system resources. A flexible discrete event simulator was used as the testbed for this study.
Keywords: concurrency control, distributed database systems, speculative locking
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16974925 The Video Database for Teaching and Learning in Football Refereeing
Authors: M. Armenteros, A. Domínguez, M. Fernández, A. J. Benítez
Abstract:
The following paper describes the video database tool used by the Fédération Internationale de Football Association (FIFA) as part of the research project developed in collaboration with the Carlos III University of Madrid. The database project began in 2012, with the aim of creating an educational tool for the training of instructors, referees and assistant referees, and it has been used in all FUTURO III courses since 2013. The platform now contains 3,135 video clips of different match situations from FIFA competitions. It has 1,835 users (FIFA instructors, referees and assistant referees). In this work, the main features of the database are described, such as the use of a search tool and the creation of multimedia presentations and video quizzes. The database has been developed in MySQL, ActionScript, Ruby on Rails and HTML. This tool has been rated by users as "very good" in all courses, which prompt us to introduce it as an ideal tool for any other sport that requires the use of video analysis.Keywords: Video database, FIFA, refereeing, e-learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13164924 Application of Company Financial Crisis Early Warning Model- Use of “Financial Reference Database“
Authors: Chiung-ying Lee, Chia-hua Chang
Abstract:
In July 1, 2007, Taiwan Stock Exchange (TWSE) on market observation post system (MOPS) adds a new "Financial reference database" for investors to do investment reference. This database as a warning to public offering companies listed on the public financial information and it original within eight targets. In this paper, this database provided by the indicators for the application of company financial crisis early warning model verify that the database provided by the indicator forecast for the financial crisis, whether or not companies have a high accuracy rate as opposed to domestic and foreign scholars have positive results. There is use of Logistic Regression Model application of the financial early warning model, in which no joined back-conditions is the first model, joined it in is the second model, has been taken occurred in the financial crisis of companies to research samples and then business took place before the financial crisis point with T-1 and T-2 sample data to do positive analysis. The results show that this database provided the debt ratio and net per share for the best forecast variables.Keywords: Financial reference database, Financial early warning model, Logistic Regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14274923 Materialized View Effect on Query Performance
Authors: Yusuf Ziya Ayık, Ferhat Kahveci
Abstract:
Currently, database management systems have various tools such as backup and maintenance, and also provide statistical information such as resource usage and security. In terms of query performance, this paper covers query optimization, views, indexed tables, pre-computation materialized view, query performance analysis in which query plan alternatives can be created and the least costly one selected to optimize a query. Indexes and views can be created for related table columns. The literature review of this study showed that, in the course of time, despite the growing capabilities of the database management system, only database administrators are aware of the need for dealing with archival and transactional data types differently. These data may be constantly changing data used in everyday life, and also may be from the completed questionnaire whose data input was completed. For both types of data, the database uses its capabilities; but as shown in the findings section, instead of repeating similar heavy calculations which are carrying out same results with the same query over a survey results, using materialized view results can be in a more simple way. In this study, this performance difference was observed quantitatively considering the cost of the query.
Keywords: Materialized view, pre-computation, query cost, query performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13454922 Integration of Image and Patient Data, Software and International Coding Systems for Use in a Mammography Research Project
Authors: V. Balanica, W. I. D. Rae, M. Caramihai, S. Acho, C. P. Herbst
Abstract:
Mammographic images and data analysis to facilitate modelling or computer aided diagnostic (CAD) software development should best be done using a common database that can handle various mammographic image file formats and relate these to other patient information. This would optimize the use of the data as both primary reporting and enhanced information extraction of research data could be performed from the single dataset. One desired improvement is the integration of DICOM file header information into the database, as an efficient and reliable source of supplementary patient information intrinsically available in the images. The purpose of this paper was to design a suitable database to link and integrate different types of image files and gather common information that can be further used for research purposes. An interface was developed for accessing, adding, updating, modifying and extracting data from the common database, enhancing the future possible application of the data in CAD processing. Technically, future developments envisaged include the creation of an advanced search function to selects image files based on descriptor combinations. Results can be further used for specific CAD processing and other research. Design of a user friendly configuration utility for importing of the required fields from the DICOM files must be done.Keywords: Database Integration, Mammogram Classification, Tumour Classification, Computer Aided Diagnosis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19454921 Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures
Authors: Do Phuc, Nguyen Thi Kim Phung
Abstract:
In this paper, we represent protein structure by using graph. A protein structure database will become a graph database. Each graph is represented by a spectral vector. We use Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of adjacency matrix of graph. To measure the similarity between two graphs, we calculate the Euclidean distance between two graph spectral vectors. To cluster the graphs, we use M-tree with the Euclidean distance to cluster spectral vectors. Besides, M-tree can be used for graph searching in graph database. Our proposal method was tested with graph database of 100 graphs representing 100 protein structures downloaded from Protein Data Bank (PDB) and we compare the result with the SCOP hierarchical structure.Keywords: Eigenvalues, m-tree, graph database, protein structure, spectra graph theory.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16564920 A Generic, Functionally Comprehensive Approach to Maintaining an Ontology as a Relational Database
Authors: Jennifer Leopold, Alton Coalter, Leong Lee
Abstract:
An ontology is a data model that represents a set of concepts in a given field and the relationships among those concepts. As the emphasis on achieving a semantic web continues to escalate, ontologies for all types of domains increasingly will be developed. These ontologies may become large and complex, and as their size and complexity grows, so will the need for multi-user interfaces for ontology curation. Herein a functionally comprehensive, generic approach to maintaining an ontology as a relational database is presented. Unlike many other ontology editors that utilize a database, this approach is entirely domain-generic and fully supports Webbased, collaborative editing including the designation of different levels of authorization for users.Keywords: Ontology Editor, Relational Database, CollaborativeCuration.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14464919 GIS-based Approach for Land-Use Analysis: A Case Study
Authors: M. Giannopoulou, I. Roukounis, A. Roukouni.
Abstract:
Geographical Information Systems are an integral part of planning in modern technical systems. Nowadays referred to as Spatial Decision Support Systems, as they allow synergy database management systems and models within a single user interface machine and they are important tools in spatial design for evaluating policies and programs at all levels of administration. This work refers to the creation of a Geographical Information System in the context of a broader research in the area of influence of an under construction station of the new metro in the Greek city of Thessaloniki, which included statistical and multivariate data analysis and diagrammatic representation, mapping and interpretation of the results.Keywords: Databases, Geographical information systems (GIS), Land-use planning, Metro stations
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16044918 A Strategy for Address Coding from HouseHold Registry Database
Authors: Yungchien Cheng, Chienmin Chu
Abstract:
Address Matching is an important application of Geographic Information System (GIS). Prior to Address Matching working, obtaining X,Y coordinates is necessary, which process is calling Address Geocoding. This study will illustrate the effective address geocoding process of using household registry database, and the check system for geocoded address.Keywords: GIS, Address Geocoding, HouseHold Registry Database
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14784917 A Comparative Study of GTC and PSP Algorithms for Mining Sequential Patterns Embedded in Database with Time Constraints
Authors: Safa Adi
Abstract:
This paper will consider the problem of sequential mining patterns embedded in a database by handling the time constraints as defined in the GSP algorithm (level wise algorithms). We will compare two previous approaches GTC and PSP, that resumes the general principles of GSP. Furthermore this paper will discuss PG-hybrid algorithm, that using PSP and GTC. The results show that PSP and GTC are more efficient than GSP. On the other hand, the GTC algorithm performs better than PSP. The PG-hybrid algorithm use PSP algorithm for the two first passes on the database, and GTC approach for the following scans. Experiments show that the hybrid approach is very efficient for short, frequent sequences.Keywords: Database, GTC algorithm, PSP algorithm, sequential patterns, time constraints.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6994916 Recognition and Reconstruction of Partially Occluded Objects
Authors: Michela Lecca, Stefano Messelodi
Abstract:
A new automatic system for the recognition and re¬construction of resealed and/or rotated partially occluded objects is presented. The objects to be recognized are described by 2D views and each view is occluded by several half-planes. The whole object views and their visible parts (linear cuts) are then stored in a database. To establish if a region R of an input image represents an object possibly occluded, the system generates a set of linear cuts of R and compare them with the elements in the database. Each linear cut of R is associated to the most similar database linear cut. R is recognized as an instance of the object 0 if the majority of the linear cuts of R are associated to a linear cut of views of 0. In the case of recognition, the system reconstructs the occluded part of R and determines the scale factor and the orientation in the image plane of the recognized object view. The system has been tested on two different datasets of objects, showing good performance both in terms of recognition and reconstruction accuracy.
Keywords: Occluded Object Recognition, Shape Reconstruction, Automatic Self-Adaptive Systems, Linear Cut.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12854915 Extending E-learning systems based on Clause-Rule model
Authors: Keisuke Nakamura, Kiyoshi Akama, Hiroshi Mabuchi
Abstract:
E-Learning systems are used by many learners and teachers. The developer is developing the e-Learning system. However, the developer cannot do system construction to satisfy all of users- demands. We discuss a method of constructing e-Learning systems where learners and teachers can design, try to use, and share extending system functions that they want to use; which may be nally added to the system by system managers.Keywords: Clause-Rule-Model, database-access, e-Learning, Web-Application.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16794914 Obstacle Classification Method Based On 2D LIDAR Database
Authors: Moohyun Lee, Soojung Hur, Yongwan Park
Abstract:
We propose obstacle classification method based on 2D LIDAR Database. The existing obstacle classification method based on 2D LIDAR, has an advantage in terms of accuracy and shorter calculation time. However, it was difficult to classifier the type of obstacle and therefore accurate path planning was not possible. In order to overcome this problem, a method of classifying obstacle type based on width data of obstacle was proposed. However, width data was not sufficient to improve accuracy. In this paper, database was established by width and intensity data; the first classification was processed by the width data; the second classification was processed by the intensity data; classification was processed by comparing to database; result of obstacle classification was determined by finding the one with highest similarity values. An experiment using an actual autonomous vehicle under real environment shows that calculation time declined in comparison to 3D LIDAR and it was possible to classify obstacle using single 2D LIDAR.
Keywords: Obstacle, Classification, LIDAR, Segmentation, Width, Intensity, Database.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 34454913 Applying Spanning Tree Graph Theory for Automatic Database Normalization
Authors: Chetneti Srisa-an
Abstract:
In Knowledge and Data Engineering field, relational database is the best repository to store data in a real world. It has been using around the world more than eight decades. Normalization is the most important process for the analysis and design of relational databases. It aims at creating a set of relational tables with minimum data redundancy that preserve consistency and facilitate correct insertion, deletion, and modification. Normalization is a major task in the design of relational databases. Despite its importance, very few algorithms have been developed to be used in the design of commercial automatic normalization tools. It is also rare technique to do it automatically rather manually. Moreover, for a large and complex database as of now, it make even harder to do it manually. This paper presents a new complete automated relational database normalization method. It produces the directed graph and spanning tree, first. It then proceeds with generating the 2NF, 3NF and also BCNF normal forms. The benefit of this new algorithm is that it can cope with a large set of complex function dependencies.
Keywords: Relational Database, Functional Dependency, Automatic Normalization, Primary Key, Spanning tree.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28664912 Mining Sequential Patterns Using I-PrefixSpan
Authors: Dhany Saputra, Dayang R. A. Rambli, Oi Mean Foong
Abstract:
In this paper, we propose an improvement of pattern growth-based PrefixSpan algorithm, called I-PrefixSpan. The general idea of I-PrefixSpan is to use sufficient data structure for Seq-Tree framework and separator database to reduce the execution time and memory usage. Thus, with I-PrefixSpan there is no in-memory database stored after index set is constructed. The experimental result shows that using Java 2, this method improves the speed of PrefixSpan up to almost two orders of magnitude as well as the memory usage to more than one order of magnitude.Keywords: ArrayList, ArrayIntList, minimum support, sequence database, sequential patterns.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15644911 Coloured Petri Nets Model for Web Architectures of Web and Database Servers
Authors: Nidhi Gaur, Padmaja Joshi, Vijay Jain, Rajeev Srivastava
Abstract:
Web application architecture is important to achieve the desired performance for the application. Performance analysis studies are conducted to evaluate existing or planned systems. Web applications are used by hundreds of thousands of users simultaneously, which sometimes increases the risk of server failure in real time operations. We use Coloured Petri Net (CPN), a very powerful tool for modelling dynamic behaviour of a web application system. CPNs extend the vocabulary of ordinary Petri nets and add features that make them suitable for modelling large systems. The major focus of this work is on server side of web applications. The presented work focuses on modelling restructuring aspects, with major focus on concurrency and architecture, using CPN. It also focuses on bringing out the appropriate architecture for web and database servers given the number of concurrent users.Keywords: Coloured petri nets, concurrent users, performance modelling, web application architecture.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12904910 Towards an Extended SQLf: Bipolar Query Language with Preferences
Authors: L. Ludovic, R. Daniel, S-E Tbahriti
Abstract:
Database management systems that integrate user preferences promise better solution for personalization, greater flexibility and higher quality of query responses. This paper presents a tentative work that studies and investigates approaches to express user preferences in queries. We sketch an extend capabilities of SQLf language that uses the fuzzy set theory in order to define the user preferences. For that, two essential points are considered: the first concerns the expression of user preferences in SQLf by so-called fuzzy commensurable predicates set. The second concerns the bipolar way in which these user preferences are expressed on mandatory and/or optional preferences.
Keywords: Flexible query language, relational database, userpreference.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10134909 Face Recognition Based On Vector Quantization Using Fuzzy Neuro Clustering
Authors: Elizabeth B. Varghese, M. Wilscy
Abstract:
A face recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame. A lot of algorithms have been proposed for face recognition. Vector Quantization (VQ) based face recognition is a novel approach for face recognition. Here a new codebook generation for VQ based face recognition using Integrated Adaptive Fuzzy Clustering (IAFC) is proposed. IAFC is a fuzzy neural network which incorporates a fuzzy learning rule into a competitive neural network. The performance of proposed algorithm is demonstrated by using publicly available AT&T database, Yale database, Indian Face database and a small face database, DCSKU database created in our lab. In all the databases the proposed approach got a higher recognition rate than most of the existing methods. In terms of Equal Error Rate (ERR) also the proposed codebook is better than the existing methods.
Keywords: Face Recognition, Vector Quantization, Integrated Adaptive Fuzzy Clustering, Self Organization Map.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22414908 Enhancing Privacy-Preserving Cloud Database Querying by Preventing Brute Force Attacks
Authors: Ambika Vishal Pawar, Ajay Dani
Abstract:
Considering the complexities involved in Cloud computing, there are still plenty of issues that affect the privacy of data in cloud environment. Unless these problems get solved, we think that the problem of preserving privacy in cloud databases is still open. In tokenization and homomorphic cryptography based solutions for privacy preserving cloud database querying, there is possibility that by colluding with service provider adversary may run brute force attacks that will reveal the attribute values.
In this paper we propose a solution by defining the variant of K –means clustering algorithm that effectively detects such brute force attacks and enhances privacy of cloud database querying by preventing this attacks.
Keywords: Privacy, Database, Cloud Computing, Clustering, K-means, Cryptography.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25564907 Statistical Estimation of Spring-back Degree Using Texture Database
Authors: Takashi Sakai, Shinsaku Kikuta, Jun-ichi Koyama
Abstract:
Using a texture database, a statistical estimation of spring-back was conducted in this study on the basis of statistical analysis. Both spring-back in bending deformation and experimental data related to the crystal orientation show significant dispersion. Therefore, a probabilistic statistical approach was established for the proper quantification of these values. Correlation was examined among the parameters F(x) of spring-back, F(x) of the buildup fraction to three orientations after 92° bending, and F(x) at an as-received part on the basis of the three-parameter Weibull distribution. Consequent spring-back estimation using a texture database yielded excellent estimates compared with experimental values.
Keywords: Bending, Spring-back, Database, Crystallographic Orientation, Texture, SEM-EBSD, Weibull distribution, Statistical analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18994906 Increasing Profitability Supported by Innovative Methods and Designing Monitoring Software in Condition-Based Maintenance: A Case Study
Authors: Nasrin Farajiparvar
Abstract:
In the present article, a new method has been developed to enhance the application of equipment monitoring, which in turn results in improving condition-based maintenance economic impact in an automobile parts manufacturing factory. This study also describes how an effective software with a simple database can be utilized to achieve cost-effective improvements in maintenance performance. The most important results of this project are indicated here: 1. 63% reduction in direct and indirect maintenance costs. 2. Creating a proper database to analyse failures. 3. Creating a method to control system performance and develop it to similar systems. 4. Designing a software to analyse database and consequently create technical knowledge to face unusual condition of the system. Moreover, the results of this study have shown that the concept and philosophy of maintenance has not been understood in most Iranian industries. Thus, more investment is strongly required to improve maintenance conditions.
Keywords: Condition-based maintenance, Economic savings, Iran industries, Machine life prediction software.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15764905 Breast Cancer Survivability Prediction via Classifier Ensemble
Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia
Abstract:
This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.Keywords: Classifier ensemble, breast cancer survivability, data mining, SEER.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16714904 On the Impact of Reference Node Placement in Wireless Indoor Positioning Systems
Authors: Supattra Aomumpai, Chutima Prommak
Abstract:
This paper presents a studyof the impact of reference node locations on the accuracy of the indoor positioning systems. In particular, we analyze the localization accuracy of the RSSI database mapping techniques, deploying on the IEEE 802.15.4 wireless networks. The results show that the locations of the reference nodes used in the positioning systems affect the signal propagation characteristics in the service area. Thisin turn affects the accuracy of the wireless indoor positioning system. We found that suitable location of reference nodes could reduce the positioning error upto 35 %.Keywords: Indoor positioning systems, IEEE 802.15.4 wireless networks, Signal propagation characteristics
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14834903 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes
Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele
Abstract:
Objective: The objective of the study was to integrate two big databases from Swiss nutrition national survey (menuCH) and Swiss health national survey 2012 for data mining purposes. Each database has a demographic base data. An integrated Swiss database is built to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied with food consumption. Design: Swiss nutrition national survey (menuCH) with approx. 2000 respondents from two different surveys, one by Phone and the other by questionnaire along with Swiss health national survey 2012 with 21500 respondents were pre-processed, cleaned and finally integrated to a unique relational database. Results: The result of this study is an integrated relational database from the Swiss nutritional and health databases.
Keywords: Health informatics, data mining, nutritional and health databases, nutritional and chronical databases.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17254902 Information Retrieval: A Comparative Study of Textual Indexing Using an Oriented Object Database (db4o) and the Inverted File
Authors: Mohammed Erritali
Abstract:
The growth in the volume of text data such as books and articles in libraries for centuries has imposed to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories have marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (corpus) to allow a user to find the relevant ones for him, that is to say, the contents which matches with the information needs of the user. Most of the models of information retrieval use a specific data structure to index a corpus which is called "inverted file" or "reverse index". This inverted file collects information on all terms over the corpus documents specifying the identifiers of documents that contain the term in question, the frequency of each term in the documents of the corpus, the positions of the occurrences of the word... In this paper we use an oriented object database (db4o) instead of the inverted file, that is to say, instead to search a term in the inverted file, we will search it in the db4o database. The purpose of this work is to make a comparative study to see if the oriented object databases may be competing for the inverse index in terms of access speed and resource consumption using a large volume of data.
Keywords: Information Retrieval, indexation, oriented object database (db4o), inverted file.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17344901 Visualization and Indexing of Spectral Databases
Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi
Abstract:
On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from spectral database can be used to estimate unmeasured product properties and monitor the operation of the process. These techniques are based on looking for similar spectra by nearest neighborhood algorithms and distance based searching methods. Search for nearest neighbors in the spectral space is an NP-hard problem, the computational complexity increases by the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time some kind of indexing could be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirement of estimation algorithms by providing a two dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of spectral database does not only support application of distance and similarity based techniques but enables the utilization of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means the prediction has not to use the high dimension space but can be based on the mapped space too. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance based learning algorithms.
Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17694900 Extended Deductive Databases with Uncertain Information
Authors: Daniel Stamate
Abstract:
The paper presents an approach for handling uncertain information in deductive databases using multivalued logics. Uncertainty means that database facts may be assigned logical values other than the conventional ones - true and false. The logical values represent various degrees of truth, which may be combined and propagated by applying the database rules. A corresponding multivalued database semantics is defined. We show that it extends successful conventional semantics as the well-founded semantics, and has a polynomial time data complexity.Keywords: Reasoning under uncertainty, multivalued logics, deductive databases, logic programs, multivalued semantics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1347