Search results for: data grid
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7788

Search results for: data grid

7188 Analysis of Medical Data using Data Mining and Formal Concept Analysis

Authors: Anamika Gupta, Naveen Kumar, Vasudha Bhatnagar

Abstract:

This paper focuses on analyzing medical diagnostic data using classification rules in data mining and context reduction in formal concept analysis. It helps in finding redundancies among the various medical examination tests used in diagnosis of a disease. Classification rules have been derived from positive and negative association rules using the Concept lattice structure of the Formal Concept Analysis. Context reduction technique given in Formal Concept Analysis along with classification rules has been used to find redundancies among the various medical examination tests. Also it finds out whether expensive medical tests can be replaced by some cheaper tests.

Keywords: Data Mining, Formal Concept Analysis, Medical Data, Negative Classification Rules.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1727
7187 Data Transmission Reliability in Short Message Integrated Distributed Monitoring Systems

Authors: Sui Xin, Li Chunsheng, Tian Di

Abstract:

Short message integrated distributed monitoring systems (SM-DMS) are growing rapidly in wireless communication applications in various areas, such as electromagnetic field (EMF) management, wastewater monitoring, and air pollution supervision, etc. However, delay in short messages often makes the data embedded in SM-DMS transmit unreliably. Moreover, there are few regulations dealing with this problem in SMS transmission protocols. In this study, based on the analysis of the command and data requirements in the SM-DMS, we developed a processing model for the control center to solve the delay problem in data transmission. Three components of the model: the data transmission protocol, the receiving buffer pool method, and the timer mechanism were described in detail. Discussions on adjusting the threshold parameter in the timer mechanism were presented for the adaptive performance during the runtime of the SM-DMS. This model optimized the data transmission reliability in SM-DMS, and provided a supplement to the data transmission reliability protocols at the application level.

Keywords: Delay, SMS, reliability, distributed monitoringsystem (DMS), wireless communication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1701
7186 Data-organization Before Learning Multi-Entity Bayesian Networks Structure

Authors: H. Bouhamed, A. Rebai, T. Lecroq, M. Jaoua

Abstract:

The objective of our work is to develop a new approach for discovering knowledge from a large mass of data, the result of applying this approach will be an expert system that will serve as diagnostic tools of a phenomenon related to a huge information system. We first recall the general problem of learning Bayesian network structure from data and suggest a solution for optimizing the complexity by using organizational and optimization methods of data. Afterward we proposed a new heuristic of learning a Multi-Entities Bayesian Networks structures. We have applied our approach to biological facts concerning hereditary complex illnesses where the literatures in biology identify the responsible variables for those diseases. Finally we conclude on the limits arched by this work.

Keywords: Data-organization, data-optimization, automatic knowledge discovery, Multi-Entities Bayesian networks, score merging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608
7185 Data Gathering Protocols for Wireless Sensor Networks

Authors: Dhinu Johnson, Gurdip Singh

Abstract:

Sensor network applications are often data centric and involve collecting data from a set of sensor nodes to be delivered to various consumers. Typically, nodes in a sensor network are resource-constrained, and hence the algorithms operating in these networks must be efficient. There may be several algorithms available implementing the same service, and efficient considerations may require a sensor application to choose the best suited algorithm. In this paper, we present a systematic evaluation of a set of algorithms implementing the data gathering service. We propose a modular infrastructure for implementing such algorithms in TOSSIM with separate configurable modules for various tasks such as interest propagation, data propagation, aggregation, and path maintenance. By appropriately configuring these modules, we propose a number of data gathering algorithms, each of which incorporates a different set of heuristics for optimizing performance. We have performed comprehensive experiments to evaluate the effectiveness of these heuristics, and we present results from our experimentation efforts.

Keywords: Data Centric Protocols, Shortest Paths, Sensor networks, Message passing systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1435
7184 Measured versus Default Interstate Traffic Data in New Mexico, USA

Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder

Abstract:

This study investigates how the site specific traffic data differs from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed in Interstate-40 (I-40) and Interstate-25 (I-25) to developed site specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year data from November 2013 to October 2014 was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axle per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between measured data and AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 significantly differ from the default values.

Keywords: AASHTOWare, Traffic, Weigh-in-Motion, Axle load Distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1690
7183 Energy Efficient In-Network Data Processing in Sensor Networks

Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik

Abstract:

The Sensor Network consists of densely deployed sensor nodes. Energy optimization is one of the most important aspects of sensor application design. Data acquisition and aggregation techniques for processing data in-network should be energy efficient. Due to the cross-layer design, resource-limited and noisy nature of Wireless Sensor Networks(WSNs), it is challenging to study the performance of these systems in a realistic setting. In this paper, we propose optimizing queries by aggregation of data and data redundancy to reduce energy consumption without requiring all sensed data and directed diffusion communication paradigm to achieve power savings, robust communication and processing data in-network. To estimate the per-node power consumption POWERTossim mica2 energy model is used, which provides scalable and accurate results. The performance analysis shows that the proposed methods overcomes the existing methods in the aspects of energy consumption in wireless sensor networks.

Keywords: Data Aggregation, Directed Diffusion, Partial Aggregation, Packet Merging, Query Plan.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1826
7182 Preliminary Analysis of Energy Efficiency in Data Center: Case Study

Authors: Xiaoshu Lu, Tao Lu, Matias Remes, Martti Viljanen

Abstract:

As the data-driven economy is growing faster than ever and the demand for energy is being spurred, we are facing unprecedented challenges of improving energy efficiency in data centers. Effectively maximizing energy efficiency or minimising the cooling energy demand is becoming pervasive for data centers. This paper investigates overall energy consumption and the energy efficiency of cooling system for a data center in Finland as a case study. The power, cooling and energy consumption characteristics and operation condition of facilities are examined and analysed. Potential energy and cooling saving opportunities are identified and further suggestions for improving the performance of cooling system are put forward. Results are presented as a comprehensive evaluation of both the energy performance and good practices of energy efficient cooling operations for the data center. Utilization of an energy recovery concept for cooling system is proposed. The conclusion we can draw is that even though the analysed data center demonstrated relatively high energy efficiency, based on its power usage effectiveness value, there is still a significant potential for energy saving from its cooling systems.

Keywords: Data center, case study, cooling system, energyefficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1536
7181 Multidimensional Visualization Tools for Analysis of Expression Data

Authors: Urska Cvek, Marjan Trutschl, Randolph Stone II, Zanobia Syed, John L. Clifford, Anita L. Sabichi

Abstract:

Expression data analysis is based mostly on the statistical approaches that are indispensable for the study of biological systems. Large amounts of multidimensional data resulting from the high-throughput technologies are not completely served by biostatistical techniques and are usually complemented with visual, knowledge discovery and other computational tools. In many cases, in biological systems we only speculate on the processes that are causing the changes, and it is the visual explorative analysis of data during which a hypothesis is formed. We would like to show the usability of multidimensional visualization tools and promote their use in life sciences. We survey and show some of the multidimensional visualization tools in the process of data exploration, such as parallel coordinates and radviz and we extend them by combining them with the self-organizing map algorithm. We use a time course data set of transitional cell carcinoma of the bladder in our examples. Analysis of data with these tools has the potential to uncover additional relationships and non-trivial structures.

Keywords: microarrays, visualization, parallel coordinates, radviz, self-organizing maps.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2502
7180 Critical Assessment of Scoring Schemes for Protein-Protein Docking Predictions

Authors: Dhananjay C. Joshi, Jung-Hsin Lin

Abstract:

Protein-protein interactions (PPI) play a crucial role in many biological processes such as cell signalling, transcription, translation, replication, signal transduction, and drug targeting, etc. Structural information about protein-protein interaction is essential for understanding the molecular mechanisms of these processes. Structures of protein-protein complexes are still difficult to obtain by biophysical methods such as NMR and X-ray crystallography, and therefore protein-protein docking computation is considered an important approach for understanding protein-protein interactions. However, reliable prediction of the protein-protein complexes is still under way. In the past decades, several grid-based docking algorithms based on the Katchalski-Katzir scoring scheme were developed, e.g., FTDock, ZDOCK, HADDOCK, RosettaDock, HEX, etc. However, the success rate of protein-protein docking prediction is still far from ideal. In this work, we first propose a more practical measure for evaluating the success of protein-protein docking predictions,the rate of first success (RFS), which is similar to the concept of mean first passage time (MFPT). Accordingly, we have assessed the ZDOCK bound and unbound benchmarks 2.0 and 3.0. We also createda new benchmark set for protein-protein docking predictions, in which the complexes have experimentally determined binding affinity data. We performed free energy calculation based on the solution of non-linear Poisson-Boltzmann equation (nlPBE) to improve the binding mode prediction. We used the well-studied thebarnase-barstarsystem to validate the parameters for free energy calculations. Besides,thenlPBE-based free energy calculations were conducted for the badly predicted cases by ZDOCK and ZRANK. We found that direct molecular mechanics energetics cannot be used to discriminate the native binding pose from the decoys.Our results indicate that nlPBE-based calculations appeared to be one of the promising approaches for improving the success rate of binding pose predictions.

Keywords: protein-protein docking, protein-protein interaction, molecular mechanics energetics, Poisson-Boltzmann calculations

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1794
7179 A Multi-Agent Framework for Data Mining

Authors: Kamal Ali Albashiri, Khaled Ahmed Kadouh

Abstract:

A generic and extendible Multi-Agent Data Mining (MADM) framework, MADMF (the Multi-Agent Data Mining Framework) is described. The central feature of the framework is that it avoids the use of agreed meta-language formats by supporting a framework of wrappers. The advantage offered is that the framework is easily extendible, so that further data agents and mining agents can simply be added to the framework. A demonstration MADMF framework is currently available. The paper includes details of the MADMF architecture and the wrapper principle incorporated into it. A full description and evaluation of the framework-s operation is provided by considering two MADM scenarios.

Keywords: Multi-Agent Data Mining (MADM), Frequent Itemsets, Meta ARM, Association Rule Mining, Classifier generator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2068
7178 The Relevance of Data Warehousing and Data Mining in the Field of Evidence-based Medicine to Support Healthcare Decision Making

Authors: Nevena Stolba, A Min Tjoa

Abstract:

Evidence-based medicine is a new direction in modern healthcare. Its task is to prevent, diagnose and medicate diseases using medical evidence. Medical data about a large patient population is analyzed to perform healthcare management and medical research. In order to obtain the best evidence for a given disease, external clinical expertise as well as internal clinical experience must be available to the healthcare practitioners at right time and in the right manner. External evidence-based knowledge can not be applied directly to the patient without adjusting it to the patient-s health condition. We propose a data warehouse based approach as a suitable solution for the integration of external evidence-based data sources into the existing clinical information system and data mining techniques for finding appropriate therapy for a given patient and a given disease. Through integration of data warehousing, OLAP and data mining techniques in the healthcare area, an easy to use decision support platform, which supports decision making process of care givers and clinical managers, is built. We present three case studies, which show, that a clinical data warehouse that facilitates evidence-based medicine is a reliable, powerful and user-friendly platform for strategic decision making, which has a great relevance for the practice and acceptance of evidence-based medicine.

Keywords: data mining, data warehousing, decision-support systems, evidence-based medicine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3801
7177 Influence of the Line Parameters in Transmission Line Fault Location

Authors: Marian Dragomir, Alin Dragomir

Abstract:

In the paper, two fault location algorithms are presented for transmission lines which use the line parameters to estimate the distance to the fault. The first algorithm uses only the measurements from one end of the line and the positive and zero sequence parameters of the line, while the second one uses the measurements from both ends of the line and only the positive sequence parameters of the line. The algorithms were tested using a transmission grid transposed in MATLAB. In a first stage it was established a fault location base line, where the algorithms mentioned above estimate the fault locations using the exact line parameters. After that, the positive and zero sequence resistance and reactance of the line were calculated again for different ground resistivity values and then the fault locations were estimated again in order to compare the results with the base line results. The results show that the algorithm which uses the zero sequence impedance of the line is the most sensitive to the line parameters modifications. The other algorithm is less sensitive to the line parameters modification.

Keywords: Estimation algorithms, fault location, line parameters, simulation tool.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1148
7176 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics

Authors: Farhad Asadi, Mohammad Javad Mollakazemi

Abstract:

In this paper, Bayesian online inference in models of data series are constructed by change-points algorithm, which separated the observed time series into independent series and study the change and variation of the regime of the data with related statistical characteristics. variation of statistical characteristics of time series data often represent separated phenomena in the some dynamical system, like a change in state of brain dynamical reflected in EEG signal data measurement or a change in important regime of data in many dynamical system. In this paper, prediction algorithm for studying change point location in some time series data is simulated. It is verified that pattern of proposed distribution of data has important factor on simpler and smother fluctuation of hazard rate parameter and also for better identification of change point locations. Finally, the conditions of how the time series distribution effect on factors in this approach are explained and validated with different time series databases for some dynamical system.

Keywords: Time series, fluctuation in statistical characteristics, optimal learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806
7175 AudioMine: Medical Data Mining in Heterogeneous Audiology Records

Authors: Shaun Cox, Michael Oakes, Stefan Wermter, Maurice Hawthorne

Abstract:

We report on the results of a pilot study in which a data-mining tool was developed for mining audiology records. The records were heterogeneous in that they contained numeric, category and textual data. The tools developed are designed to observe associations between any field in the records and any other field. The techniques employed were the statistical chi-squared test, and the use of self-organizing maps, an unsupervised neural learning approach.

Keywords: Audiology, data mining, chi-squared, self-organizing maps

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1663
7174 Hospital Based Electrocardiogram Sensor Grid

Authors: Suken Nayak, Aditya Kambli, Bharati Ingale, Gauri Shukla

Abstract:

The technological concepts such as wireless hospital and portable cardiac telemetry system require the development of physiological signal acquisition devices to be easily integrated into the hospital database. In this paper we present the low cost, portable wireless ECG acquisition hardware that transmits ECG signals to a dedicated computer.The front end of the system obtains and processes incoming signals, which are then transmitted via a microcontroller and wireless Bluetooth module. A monitoring purpose Bluetooth based end user application integrated with patient database management module is developed for the computers. The system will act as a continuous event recorder, which can be used to follow up patients who have been resuscitatedfrom cardiac arrest, ventricular tachycardia but also for diagnostic purposes for patients with arrhythmia symptoms. In addition, cardiac information can be saved into the patient-s database of the hospital.

Keywords: ECG, Bluetooth communication, monitoring application, patient database

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2126
7173 Advanced Pulse Width Modulation Techniques for Z Source Multi Level Inverter

Authors: B. M. Manjunatha, D. V. Ashok Kumar, M. Vijay Kumar

Abstract:

This paper proposes five level diode clamped Z source Inverter. The existing PWM techniques used for ZSI are restricted for two level. The two level Z Source Inverter have high harmonic distortions which effects the performance of the grid connected PV system. To improve the performance of the system the number of voltage levels in the output waveform need to be increased. This paper presents comparative analysis of a five level diode clamped Z source Inverter with different carrier based Modified Pulse Width Modulation techniques. The parameters considered for comparison are output voltage, voltage gain, voltage stress across switch and total harmonic distortion when powered by same DC supply. Analytical results are verified using MATLAB.

Keywords: Diode Clamped, Pulse Width Modulation, total harmonic distortion, Z Source Inverter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2353
7172 Wireless Distributed Load-Shedding Management System for Non-Emergency Cases

Authors: Taha Landolsi, A. R. Al-Ali, Tarik Ozkul, Mohammad A. Al-Rousan

Abstract:

In this paper, we present a cost-effective wireless distributed load shedding system for non-emergency scenarios. In power transformer locations where SCADA system cannot be used, the proposed solution provides a reasonable alternative that combines the use of microcontrollers and existing GSM infrastructure to send early warning SMS messages to users advising them to proactively reduce their power consumption before system capacity is reached and systematic power shutdown takes place. A novel communication protocol and message set have been devised to handle the messaging between the transformer sites, where the microcontrollers are located and where the measurements take place, and the central processing site where the database server is hosted. Moreover, the system sends warning messages to the endusers mobile devices that are used as communication terminals. The system has been implemented and tested via different experimental results.

Keywords: Smart Grid, Load shedding, Demand SideManagement, GSM Wireless Networks, SCADA systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2622
7171 Fuzzy Types Clustering for Microarray Data

Authors: Seo Young Kim, Tai Myong Choi

Abstract:

The main goal of microarray experiments is to quantify the expression of every object on a slide as precisely as possible, with a further goal of clustering the objects. Recently, many studies have discussed clustering issues involving similar patterns of gene expression. This paper presents an application of fuzzy-type methods for clustering DNA microarray data that can be applied to typical comparisons. Clustering and analyses were performed on microarray and simulated data. The results show that fuzzy-possibility c-means clustering substantially improves the findings obtained by others.

Keywords: Clustering, microarray data, Fuzzy-type clustering, Validation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1513
7170 Robust Regression and its Application in Financial Data Analysis

Authors: Mansoor Momeni, Mahmoud Dehghan Nayeri, Ali Faal Ghayoumi, Hoda Ghorbani

Abstract:

This research is aimed to describe the application of robust regression and its advantages over the least square regression method in analyzing financial data. To do this, relationship between earning per share, book value of equity per share and share price as price model and earning per share, annual change of earning per share and return of stock as return model is discussed using both robust and least square regressions, and finally the outcomes are compared. Comparing the results from the robust regression and the least square regression shows that the former can provide the possibility of a better and more realistic analysis owing to eliminating or reducing the contribution of outliers and influential data. Therefore, robust regression is recommended for getting more precise results in financial data analysis.

Keywords: Financial data analysis, Influential data, Outliers, Robust regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1928
7169 Estimation of PM2.5 Emissions and Source Apportionment Using Receptor and Dispersion Models

Authors: Swetha Priya Darshini Thammadi, Sateesh Kumar Pisini, Sanjay Kumar Shukla

Abstract:

Source apportionment using Dispersion model depends primarily on the quality of Emission Inventory. In the present study, a CMB receptor model has been used to identify the sources of PM2.5, while the AERMOD dispersion model has been used to account for missing sources of PM2.5 in the Emission Inventory. A statistical approach has been developed to quantify the missing sources not considered in the Emission Inventory. The inventory of each grid was improved by adjusting emissions based on road lengths and deficit in measured and modelled concentrations. The results showed that in CMB analyses, fugitive sources - soil and road dust - contribute significantly to ambient PM2.5 pollution. As a result, AERMOD significantly underestimated the ambient air concentration at most locations. The revised Emission Inventory showed a significant improvement in AERMOD performance which is evident through statistical tests.

Keywords: CMB, GIS, AERMOD, PM2.5, fugitive, emission inventory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 888
7168 Fuzzy Based Problem-Solution Data Structureas a Data Oriented Model for ABS Controlling

Authors: Ahmad Habibizad Navin, Mehdi Naghian Fesharaki, Mohamad Teshnelab, Ehsan Shahamatnia

Abstract:

The anti-lock braking systems installed on vehicles for safe and effective braking, are high-order nonlinear and timevariant. Using fuzzy logic controllers increase efficiency of such systems, but impose a high computational complexity as well. The main concept introduced by this paper is reducing computational complexity of fuzzy controllers by deploying problem-solution data structure. Unlike conventional methods that are based on calculations, this approach is based on data oriented modeling.

Keywords: ABS, Fuzzy controller, PSDS, Time-Memory tradeoff, Data oriented modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1729
7167 Street Network in Bandung City, Indonesia: Comparison between City Center and New Commercial Area

Authors: Siska Soesanti, Norihiro Nakai

Abstract:

Bandung city center can be deemed as economic, social and cultural center. However the city center suffers from deterioration. The retail activities tend to shift outward the city center. Numerous idyllic residences changed into business premises in two villages situated in the north part of the city during 1990s, especially after a new highway and flyover opened. According to space syntax theory, the pattern of spatial integration in the urban grid is a prime determinant of movement patterns in the system. The syntactic analysis results show the flyover has insignificant influence on street network in the city center. However the flyover has been generating a major difference in the new commercial area since it has become relatively as strategic as the city center. Besides street network, local government policy, rapid private motorization and particular condition of each site also played important roles in encouraging the current commercial areas to flourish.

Keywords: City center, commercial area, space syntax, street network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1812
7166 Methods and Algorithms of Ensuring Data Privacy in AI-Based Healthcare Systems and Technologies

Authors: Omar Farshad Jeelani, Makaire Njie, Viktoriia M. Korzhuk

Abstract:

Recently, the application of AI-powered algorithms in healthcare continues to flourish. Particularly, access to healthcare information, including patient health history, diagnostic data, and PII (Personally Identifiable Information) is paramount in the delivery of efficient patient outcomes. However, as the exchange of healthcare information between patients and healthcare providers through AI-powered solutions increases, protecting a person’s information and their privacy has become even more important. Arguably, the increased adoption of healthcare AI has resulted in a significant concentration on the security risks and protection measures to the security and privacy of healthcare data, leading to escalated analyses and enforcement. Since these challenges are brought by the use of AI-based healthcare solutions to manage healthcare data, AI-based data protection measures are used to resolve the underlying problems. Consequently, these projects propose AI-powered safeguards and policies/laws to protect the privacy of healthcare data. The project present the best-in-school techniques used to preserve data privacy of AI-powered healthcare applications. Popular privacy-protecting methods like Federated learning, cryptography techniques, differential privacy methods, and hybrid methods are discussed together with potential cyber threats, data security concerns, and prospects. Also, the project discusses some of the relevant data security acts/laws that govern the collection, storage, and processing of healthcare data to guarantee owners’ privacy is preserved. This inquiry discusses various gaps and uncertainties associated with healthcare AI data collection procedures, and identifies potential correction/mitigation measures.

Keywords: Data privacy, artificial intelligence, healthcare AI, data sharing, healthcare organizations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 88
7165 Use of Bayesian Network in Information Extraction from Unstructured Data Sources

Authors: Quratulain N. Rajput, Sajjad Haider

Abstract:

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, and information extraction and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology, presented in this paper, first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising as the methodology achieves high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.

Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2223
7164 Techno-Economic Analysis of Motor-Generator Pair System and Virtual Synchronous Generator for Providing Inertia of Power System

Authors: Zhou Yingkun, Xu Guorui, Wei Siming, Huang Yongzhang

Abstract:

With the increasing of the penetration of renewable energy in power system, the whole inertia of the power system is declining, which will endanger the frequency stability of the power system. In order to enhance the inertia, virtual synchronous generator (VSG) has been proposed. In addition, the motor-generator pair (MGP) system is proposed to enhance grid inertia. Both of them need additional equipment to provide instantaneous energy, so the economic problem should be considered. In this paper, the basic working principle of MGP system and VSG are introduced firstly. Then, the technical characteristics and economic investment of MGP/VSG are compared by calculation and simulation. The results show that the MGP system can provide same inertia with less cost than VSG.

Keywords: High renewable energy penetration, inertia of power system, virtual synchronous generator, motor-generator pair system, techno-economic analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1249
7163 Data Acquisition from Cell Phone using Logical Approach

Authors: Keonwoo Kim, Dowon Hong, Kyoil Chung, Jae-Cheol Ryou

Abstract:

Cell phone forensics to acquire and analyze data in the cellular phone is nowadays being used in a national investigation organization and a private company. In order to collect cellular phone flash memory data, we have two methods. Firstly, it is a logical method which acquires files and directories from the file system of the cell phone flash memory. Secondly, we can get all data from bit-by-bit copy of entire physical memory using a low level access method. In this paper, we describe a forensic tool to acquire cell phone flash memory data using a logical level approach. By our tool, we can get EFS file system and peek memory data with an arbitrary region from Korea CDMA cell phone.

Keywords: Forensics, logical method, acquisition, cell phone, flash memory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4110
7162 Data Migration Methodology from Relational to NoSQL Databases

Authors: Mohamed Hanine, Abdesadik Bendarag, Omar Boutkhoum

Abstract:

Currently, the field of data migration is very topical. As the number of applications developed rapidly, the ever-increasing volume of data collected has driven the architectural migration from Relational Database Management System (RDBMS) to NoSQL (Not Only SQL) database. This very recent technology is important enough in the field of database management. The main aim of this paper is to present a methodology for data migration from RDBMS to NoSQL database. To illustrate this methodology, we implement a software prototype using MySQL as a RDBMS and MongoDB as a NoSQL database. Although this is a hard engineering work, our results show that the proposed methodology can successfully accomplish the goal of this study.

Keywords: Data Migration, MySQL, RDBMS, NoSQL, MongoDB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4354
7161 GridNtru: High Performance PKCS

Authors: Narasimham Challa, Jayaram Pradhan

Abstract:

Cryptographic algorithms play a crucial role in the information society by providing protection from unauthorized access to sensitive data. It is clear that information technology will become increasingly pervasive, Hence we can expect the emergence of ubiquitous or pervasive computing, ambient intelligence. These new environments and applications will present new security challenges, and there is no doubt that cryptographic algorithms and protocols will form a part of the solution. The efficiency of a public key cryptosystem is mainly measured in computational overheads, key size and bandwidth. In particular the RSA algorithm is used in many applications for providing the security. Although the security of RSA is beyond doubt, the evolution in computing power has caused a growth in the necessary key length. The fact that most chips on smart cards can-t process key extending 1024 bit shows that there is need for alternative. NTRU is such an alternative and it is a collection of mathematical algorithm based on manipulating lists of very small integers and polynomials. This allows NTRU to high speeds with the use of minimal computing power. NTRU (Nth degree Truncated Polynomial Ring Unit) is the first secure public key cryptosystem not based on factorization or discrete logarithm problem. This means that given sufficient computational resources and time, an adversary, should not be able to break the key. The multi-party communication and requirement of optimal resource utilization necessitated the need for the present day demand of applications that need security enforcement technique .and can be enhanced with high-end computing. This has promoted us to develop high-performance NTRU schemes using approaches such as the use of high-end computing hardware. Peer-to-peer (P2P) or enterprise grids are proven as one of the approaches for developing high-end computing systems. By utilizing them one can improve the performance of NTRU through parallel execution. In this paper we propose and develop an application for NTRU using enterprise grid middleware called Alchemi. An analysis and comparison of its performance for various text files is presented.

Keywords: Alchemi, GridNtru, Ntru, PKCS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1681
7160 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.

Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1903
7159 Data Hiding by Vector Quantization in Color Image

Authors: Yung-Gi Wu

Abstract:

With the growing of computer and network, digital data can be spread to anywhere in the world quickly. In addition, digital data can also be copied or tampered easily so that the security issue becomes an important topic in the protection of digital data. Digital watermark is a method to protect the ownership of digital data. Embedding the watermark will influence the quality certainly. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible which means that the users will not conscious the existing of embedded watermark even though the embedded image has tiny difference compared to the original image. Meanwhile, VQ needs a lot of computation burden so that we adopt a fast VQ encoding scheme by partial distortion searching (PDS) and mean approximation scheme to speed up the data hiding process. The watermarks we hide to the image could be gray, bi-level and color images. Texts are also can be regarded as watermark to embed. In order to test the robustness of the system, we adopt Photoshop to fulfill sharpen, cropping and altering to check if the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist the above three kinds of tampering in general cases.

Keywords: Data hiding, vector quantization, watermark.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1771