Search results for: web data regions
7479 Approaches and Schemes for Storing DTD-Independent XML Data in Relational Databases
Authors: Mehdi Emadi, Masoud Rahgozar, Adel Ardalan, Alireza Kazerani, Mohammad Mahdi Ariyan
Abstract:
The volume of XML data exchange is explosively increasing, and the need for efficient mechanisms of XML data management is vital. Many XML storage models have been proposed for storing XML DTD-independent documents in relational database systems. Benchmarking is the best way to highlight pros and cons of different approaches. In this study, we use a common benchmarking scheme, known as XMark to compare the most cited and newly proposed DTD-independent methods in terms of logical reads, physical I/O, CPU time and duration. We show the effect of Label Path, extracting values and storing in another table and type of join needed for each method's query answering.Keywords: XML Data Management, XPath, DTD-IndependentXML Data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18817478 Approaches and Schemes for Storing DTDIndependent XML Data in Relational Databases
Authors: Mehdi Emadi, Masoud Rahgozar, Adel Ardalan, Alireza Kazerani, Mohammad Mahdi Ariyan
Abstract:
The volume of XML data exchange is explosively increasing, and the need for efficient mechanisms of XML data management is vital. Many XML storage models have been proposed for storing XML DTD-independent documents in relational database systems. Benchmarking is the best way to highlight pros and cons of different approaches. In this study, we use a common benchmarking scheme, known as XMark to compare the most cited and newly proposed DTD-independent methods in terms of logical reads, physical I/O, CPU time and duration. We show the effect of Label Path, extracting values and storing in another table and type of join needed for each method-s query answering.Keywords: XML Data Management, XPath, DTD-Independent XML Data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14537477 Analysis of Urban Population Using Twitter Distribution Data: Case Study of Makassar City, Indonesia
Authors: Yuyun Wabula, B. J. Dewancker
Abstract:
In the past decade, the social networking app has been growing very rapidly. Geolocation data is one of the important features of social media that can attach the user's location coordinate in the real world. This paper proposes the use of geolocation data from the Twitter social media application to gain knowledge about urban dynamics, especially on human mobility behavior. This paper aims to explore the relation between geolocation Twitter with the existence of people in the urban area. Firstly, the study will analyze the spread of people in the particular area, within the city using Twitter social media data. Secondly, we then match and categorize the existing place based on the same individuals visiting. Then, we combine the Twitter data from the tracking result and the questionnaire data to catch the Twitter user profile. To do that, we used the distribution frequency analysis to learn the visitors’ percentage. To validate the hypothesis, we compare it with the local population statistic data and land use mapping released by the city planning department of Makassar local government. The results show that there is the correlation between Twitter geolocation and questionnaire data. Thus, integration the Twitter data and survey data can reveal the profile of the social media users.
Keywords: Geolocation, Twitter, distribution analysis, human mobility.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11927476 Database Compression for Intelligent On-board Vehicle Controllers
Authors: Ágoston Winkler, Sándor Juhász, Zoltán Benedek
Abstract:
The vehicle fleet of public transportation companies is often equipped with intelligent on-board passenger information systems. A frequently used but time and labor-intensive way for keeping the on-board controllers up-to-date is the manual update using different memory cards (e.g. flash cards) or portable computers. This paper describes a compression algorithm that enables data transmission using low bandwidth wireless radio networks (e.g. GPRS) by minimizing the amount of data traffic. In typical cases it reaches a compression rate of an order of magnitude better than that of the general purpose compressors. Compressed data can be easily expanded by the low-performance controllers, too.
Keywords: Data analysis, data compression, differentialencoding, run-length encoding, vehicle control.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15677475 EUDIS-An Encryption Scheme for User-Data Security in Public Networks
Authors: S. Balaji, M. Rajaram
Abstract:
The method of introducing the proxy interpretation for sending and receiving requests increase the capability of the server and our approach UDIV (User-Data Identity Security) to solve the data and user authentication without extending size of the data makes better than hybrid IDS (Intrusion Detection System). And at the same time all the security stages we have framed have to pass through less through that minimize the response time of the request. Even though an anomaly detected, before rejecting it the proxy extracts its identity to prevent it to enter into system. In case of false anomalies, the request will be reshaped and transformed into legitimate request for further response. Finally we are holding the normal and abnormal requests in two different queues with own priorities.
Keywords: IDS, Data & User authentication, UDIS.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18547474 An Analysis of Genetic Algorithm Based Test Data Compression Using Modified PRL Coding
Authors: K. S. Neelukumari, K. B. Jayanthi
Abstract:
In this paper genetic based test data compression is targeted for improving the compression ratio and for reducing the computation time. The genetic algorithm is based on extended pattern run-length coding. The test set contains a large number of X value that can be effectively exploited to improve the test data compression. In this coding method, a reference pattern is set and its compatibility is checked. For this process, a genetic algorithm is proposed to reduce the computation time of encoding algorithm. This coding technique encodes the 2n compatible pattern or the inversely compatible pattern into a single test data segment or multiple test data segment. The experimental result shows that the compression ratio and computation time is reduced.Keywords: Backtracking, test data compression (TDC), x-filling, x-propagating and genetic algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18697473 Parallel Vector Processing Using Multi Level Orbital DATA
Authors: Nagi Mekhiel
Abstract:
Many applications use vector operations by applying single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that processors need not to access the memory, we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different time. We use this memory to apply parallel vector operations to data streams at first orbit level. Data processed in the first level move to upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with serial code limitations inherited in all parallel applications and interleaved it with lower level vector operations.Keywords: Memory organization, parallel processors, serial code, vector processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10627472 Identifying Knowledge Gaps in Incorporating Toxicity of Particulate Matter Constituents for Developing Regulatory Limits on Particulate Matter
Authors: Ananya Das, Arun Kumar, Gazala Habib, Vivekanandan Perumal
Abstract:
Regulatory bodies has proposed limits on Particulate Matter (PM) concentration in air; however, it does not explicitly indicate the incorporation of effects of toxicities of constituents of PM in developing regulatory limits. This study aimed to provide a structured approach to incorporate toxic effects of components in developing regulatory limits on PM. A four-step human health risk assessment framework consists of - (1) hazard identification (parameters: PM and its constituents and their associated toxic effects on health), (2) exposure assessment (parameters: concentrations of PM and constituents, information on size and shape of PM; fate and transport of PM and constituents in respiratory system), (3) dose-response assessment (parameters: reference dose or target toxicity dose of PM and its constituents), and (4) risk estimation (metric: hazard quotient and/or lifetime incremental risk of cancer as applicable). Then parameters required at every step were obtained from literature. Using this information, an attempt has been made to determine limits on PM using component-specific information. An example calculation was conducted for exposures of PM2.5 and its metal constituents from Indian ambient environment to determine limit on PM values. Identified data gaps were: (1) concentrations of PM and its constituents and their relationship with sampling regions, (2) relationship of toxicity of PM with its components.Keywords: Air, component-specific toxicity, human health risks, particulate matter.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11887471 Implementation of Congestion Management Strategies on Arterial Roads: Case Study of Geelong
Authors: A. Das, L. Hitihamillage, S. Moridpour
Abstract:
Natural disasters are inevitable to the biodiversity. Disasters such as flood, tsunami and tornadoes could be brutal, harsh and devastating. In Australia, flooding is a major issue experienced by different parts of the country. In such crisis, delays in evacuation could decide the life and death of the people living in those regions. Congestion management could become a mammoth task if there are no steps taken before such situations. In the past to manage congestion in such circumstances, many strategies were utilised such as converting the road shoulders to extra lanes or changing the road geometry by adding more lanes. However, expansion of road to resolving congestion problems is not considered a viable option nowadays. The authorities avoid this option due to many reasons, such as lack of financial support and land space. They tend to focus their attention on optimising the current resources they possess and use traffic signals to overcome congestion problems. Traffic Signal Management strategy was considered a viable option, to alleviate congestion problems in the City of Geelong, Victoria. Arterial road with signalised intersections considered in this paper and the traffic data required for modelling collected from VicRoads. Traffic signalling software SIDRA used to model the roads, and the information gathered from VicRoads. In this paper, various signal parameters utilised to assess and improve the corridor performance to achieve the best possible Level of Services (LOS) for the arterial road.
Keywords: Congestion, constraints, management, LOS.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9637470 An Investigation into the Application of Artificial Neural Networks to the Prediction of Injuries in Sport
Authors: J. McCullagh, T. Whitfort
Abstract:
Artificial Neural Networks (ANNs) have been used successfully in many scientific, industrial and business domains as a method for extracting knowledge from vast amounts of data. However the use of ANN techniques in the sporting domain has been limited. In professional sport, data is stored on many aspects of teams, games, training and players. Sporting organisations have begun to realise that there is a wealth of untapped knowledge contained in the data and there is great interest in techniques to utilise this data. This study will use player data from the elite Australian Football League (AFL) competition to train and test ANNs with the aim to predict the onset of injuries. The results demonstrate that an accuracy of 82.9% was achieved by the ANNs’ predictions across all examples with 94.5% of all injuries correctly predicted. These initial findings suggest that ANNs may have the potential to assist sporting clubs in the prediction of injuries.Keywords: Artificial Neural Networks, data, injuries, sport
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28897469 Design Considerations of PV Water Pumping and Rural Electricity System (2011) in Lower Myanmar
Authors: Nang Saw Yuzana Kya ing, Wunna Swe
Abstract:
Photovoltaic (PV) systems provides a viable means of power generation for applications like powering residential appliances, electrification of villages in rural areas, refrigeration and water pumping. Photovoltaic-power generation is reliable. The operation and maintenance costs are very low. Since Myanmar is a land of plentiful sunshine, especially in central and southern regions of the country, the solar energy could hopefully become the final solution to its energy supply problem in rural area.Keywords: Myanmar, Standalone PV Inverter, PV WaterPumping, Design Analysis, Induction Motor Driving System
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25047468 Analysis of Medical Data using Data Mining and Formal Concept Analysis
Authors: Anamika Gupta, Naveen Kumar, Vasudha Bhatnagar
Abstract:
This paper focuses on analyzing medical diagnostic data using classification rules in data mining and context reduction in formal concept analysis. It helps in finding redundancies among the various medical examination tests used in diagnosis of a disease. Classification rules have been derived from positive and negative association rules using the Concept lattice structure of the Formal Concept Analysis. Context reduction technique given in Formal Concept Analysis along with classification rules has been used to find redundancies among the various medical examination tests. Also it finds out whether expensive medical tests can be replaced by some cheaper tests.
Keywords: Data Mining, Formal Concept Analysis, Medical Data, Negative Classification Rules.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17387467 Data Transmission Reliability in Short Message Integrated Distributed Monitoring Systems
Authors: Sui Xin, Li Chunsheng, Tian Di
Abstract:
Short message integrated distributed monitoring systems (SM-DMS) are growing rapidly in wireless communication applications in various areas, such as electromagnetic field (EMF) management, wastewater monitoring, and air pollution supervision, etc. However, delay in short messages often makes the data embedded in SM-DMS transmit unreliably. Moreover, there are few regulations dealing with this problem in SMS transmission protocols. In this study, based on the analysis of the command and data requirements in the SM-DMS, we developed a processing model for the control center to solve the delay problem in data transmission. Three components of the model: the data transmission protocol, the receiving buffer pool method, and the timer mechanism were described in detail. Discussions on adjusting the threshold parameter in the timer mechanism were presented for the adaptive performance during the runtime of the SM-DMS. This model optimized the data transmission reliability in SM-DMS, and provided a supplement to the data transmission reliability protocols at the application level.
Keywords: Delay, SMS, reliability, distributed monitoringsystem (DMS), wireless communication.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17047466 Data-organization Before Learning Multi-Entity Bayesian Networks Structure
Authors: H. Bouhamed, A. Rebai, T. Lecroq, M. Jaoua
Abstract:
The objective of our work is to develop a new approach for discovering knowledge from a large mass of data, the result of applying this approach will be an expert system that will serve as diagnostic tools of a phenomenon related to a huge information system. We first recall the general problem of learning Bayesian network structure from data and suggest a solution for optimizing the complexity by using organizational and optimization methods of data. Afterward we proposed a new heuristic of learning a Multi-Entities Bayesian Networks structures. We have applied our approach to biological facts concerning hereditary complex illnesses where the literatures in biology identify the responsible variables for those diseases. Finally we conclude on the limits arched by this work.
Keywords: Data-organization, data-optimization, automatic knowledge discovery, Multi-Entities Bayesian networks, score merging.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16117465 Economics of Conflict: Core Economic Dimensions of the Georgian-South Ossetian Context
Authors: V. Charaia
Abstract:
This article presents SWOT analysis for Georgian - South Ossetian conflict. The research analyzes socio-economic aspects and considers future prospects for all sides including neighbor countries and regions. Also it includes the possibilities of positive intervention of neighbor countries to solve the conflict or to mitigate its negative results. The main question of the article is: What will it take to award Georgians and South Ossetians with a peace dividend?
Keywords: Conflict economics, Georgian economy, international organizations, peace building, S. Ossetian economy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14507464 Data Gathering Protocols for Wireless Sensor Networks
Authors: Dhinu Johnson, Gurdip Singh
Abstract:
Sensor network applications are often data centric and involve collecting data from a set of sensor nodes to be delivered to various consumers. Typically, nodes in a sensor network are resource-constrained, and hence the algorithms operating in these networks must be efficient. There may be several algorithms available implementing the same service, and efficient considerations may require a sensor application to choose the best suited algorithm. In this paper, we present a systematic evaluation of a set of algorithms implementing the data gathering service. We propose a modular infrastructure for implementing such algorithms in TOSSIM with separate configurable modules for various tasks such as interest propagation, data propagation, aggregation, and path maintenance. By appropriately configuring these modules, we propose a number of data gathering algorithms, each of which incorporates a different set of heuristics for optimizing performance. We have performed comprehensive experiments to evaluate the effectiveness of these heuristics, and we present results from our experimentation efforts.Keywords: Data Centric Protocols, Shortest Paths, Sensor networks, Message passing systems.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14447463 Measured versus Default Interstate Traffic Data in New Mexico, USA
Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder
Abstract:
This study investigates how the site specific traffic data differs from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed in Interstate-40 (I-40) and Interstate-25 (I-25) to developed site specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year data from November 2013 to October 2014 was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axle per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between measured data and AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 significantly differ from the default values.Keywords: AASHTOWare, Traffic, Weigh-in-Motion, Axle load Distribution.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16997462 Energy Efficient In-Network Data Processing in Sensor Networks
Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik
Abstract:
The Sensor Network consists of densely deployed sensor nodes. Energy optimization is one of the most important aspects of sensor application design. Data acquisition and aggregation techniques for processing data in-network should be energy efficient. Due to the cross-layer design, resource-limited and noisy nature of Wireless Sensor Networks(WSNs), it is challenging to study the performance of these systems in a realistic setting. In this paper, we propose optimizing queries by aggregation of data and data redundancy to reduce energy consumption without requiring all sensed data and directed diffusion communication paradigm to achieve power savings, robust communication and processing data in-network. To estimate the per-node power consumption POWERTossim mica2 energy model is used, which provides scalable and accurate results. The performance analysis shows that the proposed methods overcomes the existing methods in the aspects of energy consumption in wireless sensor networks.Keywords: Data Aggregation, Directed Diffusion, Partial Aggregation, Packet Merging, Query Plan.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18337461 Preliminary Analysis of Energy Efficiency in Data Center: Case Study
Authors: Xiaoshu Lu, Tao Lu, Matias Remes, Martti Viljanen
Abstract:
As the data-driven economy is growing faster than ever and the demand for energy is being spurred, we are facing unprecedented challenges of improving energy efficiency in data centers. Effectively maximizing energy efficiency or minimising the cooling energy demand is becoming pervasive for data centers. This paper investigates overall energy consumption and the energy efficiency of cooling system for a data center in Finland as a case study. The power, cooling and energy consumption characteristics and operation condition of facilities are examined and analysed. Potential energy and cooling saving opportunities are identified and further suggestions for improving the performance of cooling system are put forward. Results are presented as a comprehensive evaluation of both the energy performance and good practices of energy efficient cooling operations for the data center. Utilization of an energy recovery concept for cooling system is proposed. The conclusion we can draw is that even though the analysed data center demonstrated relatively high energy efficiency, based on its power usage effectiveness value, there is still a significant potential for energy saving from its cooling systems.Keywords: Data center, case study, cooling system, energyefficiency.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15437460 Multidimensional Visualization Tools for Analysis of Expression Data
Authors: Urska Cvek, Marjan Trutschl, Randolph Stone II, Zanobia Syed, John L. Clifford, Anita L. Sabichi
Abstract:
Expression data analysis is based mostly on the statistical approaches that are indispensable for the study of biological systems. Large amounts of multidimensional data resulting from the high-throughput technologies are not completely served by biostatistical techniques and are usually complemented with visual, knowledge discovery and other computational tools. In many cases, in biological systems we only speculate on the processes that are causing the changes, and it is the visual explorative analysis of data during which a hypothesis is formed. We would like to show the usability of multidimensional visualization tools and promote their use in life sciences. We survey and show some of the multidimensional visualization tools in the process of data exploration, such as parallel coordinates and radviz and we extend them by combining them with the self-organizing map algorithm. We use a time course data set of transitional cell carcinoma of the bladder in our examples. Analysis of data with these tools has the potential to uncover additional relationships and non-trivial structures.Keywords: microarrays, visualization, parallel coordinates, radviz, self-organizing maps.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25087459 A Multi-Agent Framework for Data Mining
Authors: Kamal Ali Albashiri, Khaled Ahmed Kadouh
Abstract:
A generic and extendible Multi-Agent Data Mining (MADM) framework, MADMF (the Multi-Agent Data Mining Framework) is described. The central feature of the framework is that it avoids the use of agreed meta-language formats by supporting a framework of wrappers. The advantage offered is that the framework is easily extendible, so that further data agents and mining agents can simply be added to the framework. A demonstration MADMF framework is currently available. The paper includes details of the MADMF architecture and the wrapper principle incorporated into it. A full description and evaluation of the framework-s operation is provided by considering two MADM scenarios.Keywords: Multi-Agent Data Mining (MADM), Frequent Itemsets, Meta ARM, Association Rule Mining, Classifier generator.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20747458 The Relevance of Data Warehousing and Data Mining in the Field of Evidence-based Medicine to Support Healthcare Decision Making
Authors: Nevena Stolba, A Min Tjoa
Abstract:
Evidence-based medicine is a new direction in modern healthcare. Its task is to prevent, diagnose and medicate diseases using medical evidence. Medical data about a large patient population is analyzed to perform healthcare management and medical research. In order to obtain the best evidence for a given disease, external clinical expertise as well as internal clinical experience must be available to the healthcare practitioners at right time and in the right manner. External evidence-based knowledge can not be applied directly to the patient without adjusting it to the patient-s health condition. We propose a data warehouse based approach as a suitable solution for the integration of external evidence-based data sources into the existing clinical information system and data mining techniques for finding appropriate therapy for a given patient and a given disease. Through integration of data warehousing, OLAP and data mining techniques in the healthcare area, an easy to use decision support platform, which supports decision making process of care givers and clinical managers, is built. We present three case studies, which show, that a clinical data warehouse that facilitates evidence-based medicine is a reliable, powerful and user-friendly platform for strategic decision making, which has a great relevance for the practice and acceptance of evidence-based medicine.
Keywords: data mining, data warehousing, decision-support systems, evidence-based medicine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38117457 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics
Authors: Farhad Asadi, Mohammad Javad Mollakazemi
Abstract:
In this paper, Bayesian online inference in models of data series are constructed by change-points algorithm, which separated the observed time series into independent series and study the change and variation of the regime of the data with related statistical characteristics. variation of statistical characteristics of time series data often represent separated phenomena in the some dynamical system, like a change in state of brain dynamical reflected in EEG signal data measurement or a change in important regime of data in many dynamical system. In this paper, prediction algorithm for studying change point location in some time series data is simulated. It is verified that pattern of proposed distribution of data has important factor on simpler and smother fluctuation of hazard rate parameter and also for better identification of change point locations. Finally, the conditions of how the time series distribution effect on factors in this approach are explained and validated with different time series databases for some dynamical system.
Keywords: Time series, fluctuation in statistical characteristics, optimal learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18127456 AudioMine: Medical Data Mining in Heterogeneous Audiology Records
Authors: Shaun Cox, Michael Oakes, Stefan Wermter, Maurice Hawthorne
Abstract:
We report on the results of a pilot study in which a data-mining tool was developed for mining audiology records. The records were heterogeneous in that they contained numeric, category and textual data. The tools developed are designed to observe associations between any field in the records and any other field. The techniques employed were the statistical chi-squared test, and the use of self-organizing maps, an unsupervised neural learning approach.
Keywords: Audiology, data mining, chi-squared, self-organizing maps
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16717455 Fuzzy Types Clustering for Microarray Data
Authors: Seo Young Kim, Tai Myong Choi
Abstract:
The main goal of microarray experiments is to quantify the expression of every object on a slide as precisely as possible, with a further goal of clustering the objects. Recently, many studies have discussed clustering issues involving similar patterns of gene expression. This paper presents an application of fuzzy-type methods for clustering DNA microarray data that can be applied to typical comparisons. Clustering and analyses were performed on microarray and simulated data. The results show that fuzzy-possibility c-means clustering substantially improves the findings obtained by others.Keywords: Clustering, microarray data, Fuzzy-type clustering, Validation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15217454 Robust Regression and its Application in Financial Data Analysis
Authors: Mansoor Momeni, Mahmoud Dehghan Nayeri, Ali Faal Ghayoumi, Hoda Ghorbani
Abstract:
This research is aimed to describe the application of robust regression and its advantages over the least square regression method in analyzing financial data. To do this, relationship between earning per share, book value of equity per share and share price as price model and earning per share, annual change of earning per share and return of stock as return model is discussed using both robust and least square regressions, and finally the outcomes are compared. Comparing the results from the robust regression and the least square regression shows that the former can provide the possibility of a better and more realistic analysis owing to eliminating or reducing the contribution of outliers and influential data. Therefore, robust regression is recommended for getting more precise results in financial data analysis.
Keywords: Financial data analysis, Influential data, Outliers, Robust regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19327453 Hierarchical Checkpoint Protocol in Data Grids
Authors: Rahma Souli-Jbali, Minyar Sassi Hidri, Rahma Ben Ayed
Abstract:
Grid of computing nodes has emerged as a representative means of connecting distributed computers or resources scattered all over the world for the purpose of computing and distributed storage. Since fault tolerance becomes complex due to the availability of resources in decentralized grid environment, it can be used in connection with replication in data grids. The objective of our work is to present fault tolerance in data grids with data replication-driven model based on clustering. The performance of the protocol is evaluated with Omnet++ simulator. The computational results show the efficiency of our protocol in terms of recovery time and the number of process in rollbacks.Keywords: Data grids, fault tolerance, chandy-lamport, clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9517452 Altered Network Organization in Mild Alzheimer's Disease Compared to Mild Cognitive Impairment Using Resting-State EEG
Authors: Chia-Feng Lu, Yuh-Jen Wang, Shin Teng, Yu-Te Wu, Sui-Hing Yan
Abstract:
Brain functional networks based on resting-state EEG data were compared between patients with mild Alzheimer’s disease (mAD) and matched patients with amnestic subtype of mild cognitive impairment (aMCI). We integrated the time–frequency cross mutual information (TFCMI) method to estimate the EEG functional connectivity between cortical regions and the network analysis based on graph theory to further investigate the alterations of functional networks in mAD compared with aMCI group. We aimed at investigating the changes of network integrity, local clustering, information processing efficiency, and fault tolerance in mAD brain networks for different frequency bands based on several topological properties, including degree, strength, clustering coefficient, shortest path length, and efficiency. Results showed that the disruptions of network integrity and reductions of network efficiency in mAD characterized by lower degree, decreased clustering coefficient, higher shortest path length, and reduced global and local efficiencies in the delta, theta, beta2, and gamma bands were evident. The significant changes in network organization can be used in assisting discrimination of mAD from aMCI in clinical.
Keywords: EEG, functional connectivity, graph theory, TFCMI.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25057451 Fuzzy Based Problem-Solution Data Structureas a Data Oriented Model for ABS Controlling
Authors: Ahmad Habibizad Navin, Mehdi Naghian Fesharaki, Mohamad Teshnelab, Ehsan Shahamatnia
Abstract:
The anti-lock braking systems installed on vehicles for safe and effective braking, are high-order nonlinear and timevariant. Using fuzzy logic controllers increase efficiency of such systems, but impose a high computational complexity as well. The main concept introduced by this paper is reducing computational complexity of fuzzy controllers by deploying problem-solution data structure. Unlike conventional methods that are based on calculations, this approach is based on data oriented modeling.Keywords: ABS, Fuzzy controller, PSDS, Time-Memory tradeoff, Data oriented modeling.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17367450 Methods and Algorithms of Ensuring Data Privacy in AI-Based Healthcare Systems and Technologies
Authors: Omar Farshad Jeelani, Makaire Njie, Viktoriia M. Korzhuk
Abstract:
Recently, the application of AI-powered algorithms in healthcare continues to flourish. Particularly, access to healthcare information, including patient health history, diagnostic data, and PII (Personally Identifiable Information) is paramount in the delivery of efficient patient outcomes. However, as the exchange of healthcare information between patients and healthcare providers through AI-powered solutions increases, protecting a person’s information and their privacy has become even more important. Arguably, the increased adoption of healthcare AI has resulted in a significant concentration on the security risks and protection measures to the security and privacy of healthcare data, leading to escalated analyses and enforcement. Since these challenges are brought by the use of AI-based healthcare solutions to manage healthcare data, AI-based data protection measures are used to resolve the underlying problems. Consequently, these projects propose AI-powered safeguards and policies/laws to protect the privacy of healthcare data. The project present the best-in-school techniques used to preserve data privacy of AI-powered healthcare applications. Popular privacy-protecting methods like Federated learning, cryptography techniques, differential privacy methods, and hybrid methods are discussed together with potential cyber threats, data security concerns, and prospects. Also, the project discusses some of the relevant data security acts/laws that govern the collection, storage, and processing of healthcare data to guarantee owners’ privacy is preserved. This inquiry discusses various gaps and uncertainties associated with healthcare AI data collection procedures, and identifies potential correction/mitigation measures.
Keywords: Data privacy, artificial intelligence, healthcare AI, data sharing, healthcare organizations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 114