Search results for: XML Data Management
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9290

Search results for: XML Data Management

7700 Applying Fuzzy FP-Growth to Mine Fuzzy Association Rules

Authors: Chien-Hua Wang, Wei-Hsuan Lee, Chin-Tzong Pang

Abstract:

In data mining, the association rules are used to find for the associations between the different items of the transactions database. As the data collected and stored, rules of value can be found through association rules, which can be applied to help managers execute marketing strategies and establish sound market frameworks. This paper aims to use Fuzzy Frequent Pattern growth (FFP-growth) to derive from fuzzy association rules. At first, we apply fuzzy partition methods and decide a membership function of quantitative value for each transaction item. Next, we implement FFP-growth to deal with the process of data mining. In addition, in order to understand the impact of Apriori algorithm and FFP-growth algorithm on the execution time and the number of generated association rules, the experiment will be performed by using different sizes of databases and thresholds. Lastly, the experiment results show FFPgrowth algorithm is more efficient than other existing methods.

Keywords: Data mining, association rule, fuzzy frequent patterngrowth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1788
7699 Investigating Crime Hotspot Places and their Implication to Urban Environmental Design: A Geographic Visualization and Data Mining Approach

Authors: Donna R. Tabangin, Jacqueline C. Flores, Nelson F. Emperador

Abstract:

Information is power. Geographical information is an emerging science that is advancing the development of knowledge to further help in the understanding of the relationship of “place" with other disciplines such as crime. The researchers used crime data for the years 2004 to 2007 from the Baguio City Police Office to determine the incidence and actual locations of crime hotspots. Combined qualitative and quantitative research methodology was employed through extensive fieldwork and observation, geographic visualization with Geographic Information Systems (GIS) and Global Positioning Systems (GPS), and data mining. The paper discusses emerging geographic visualization and data mining tools and methodologies that can be used to generate baseline data for environmental initiatives such as urban renewal and rejuvenation. The study was able to demonstrate that crime hotspots can be computed and were seen to be occurring to some select places in the Central Business District (CBD) of Baguio City. It was observed that some characteristics of the hotspot places- physical design and milieu may play an important role in creating opportunities for crime. A list of these environmental attributes was generated. This derived information may be used to guide the design or redesign of the urban environment of the City to be able to reduce crime and at the same time improve it physically.

Keywords: Crime mapping, data mining, environmental design, geographic visualization, GIS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2589
7698 Methodologies for Management of Sustainable Tourism: A Case Study in Jalapão/Tocantins/Brazil

Authors: Mary L. G. S. Senna, Veruska C. Dutra, Afonso R. Aquino

Abstract:

The study is in application and analysis of two tourism management tools that can contribute to making public managers decision: the Barometer of Tourism Sustainability (BTS) and the Ecological Footprint (EF). The results have shown that BTS allows you to have an integrated view of the tourism system, awakening to the need for planning of appropriate actions so that it can achieve the positive scale proposed (potentially sustainable). Already the methodology of ecological tourism footprint is an important tool to measure potential impacts generated by tourism to tourist reality.

Keywords: Barometer of tourism sustainability, ecological footprint of tourism, Jalapão/Brazil, sustainable tourism.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4820
7697 Development of a Smart System for Measuring Strain Levels of Natural Gas and Petroleum Pipelines on Earthquake Fault Lines in Türkiye

Authors: Ahmet Yetik, Seyit Ali Kara, Cevat Özarpa

Abstract:

Load changes occur on natural gas and oil pipelines due to natural disasters. The displacement of the soil around the natural gas and oil pipes due to situations that may cause erosion, such as earthquakes, landslides, and floods, is the source of this load change. The exposure of natural gas and oil pipes to variable loads causes deformation, cracks, and breaks in these pipes. Such cracks and breaks can cause significant damage to people and the environment, including the risk of explosions. Especially with the examinations made after natural disasters, it can be easily understood which of the pipes has sustained more damage in those quake-affected regions. It has been determined that earthquakes in Türkiye have caused permanent damage to pipelines. This project was initiated in response to the identification of cracks and gas leaks in the insulation gaskets placed in the pipelines, especially at the junction points. In this study, a SCADA (Supervisory Control and Data Acquisition) application has been developed to monitor load changes caused by natural disasters. The developed SCADA application monitors the changes in the x, y, and z axes of the stresses occurring in the pipes with the help of strain gauge sensors placed on the pipes. For the developed SCADA system, test setups in accordance with the standards were created during the fieldwork. The test setups created were integrated into the SCADA system, and the system was followed up. Thanks to the SCADA system developed with the field application, the load changes that will occur on the natural gas and oil pipes are instantly monitored, and the accumulations that may create a load on the pipes and their surroundings are immediately intervened, and new risks that may arise are prevented. It has contributed to energy supply security, asset management, pipeline holistic management, and overall sustainability in the industry.

Keywords: Earthquake, natural gas pipes, oil pipes, voltage measurement, landslide.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 71
7696 Learning and Evaluating Possibilistic Decision Trees using Information Affinity

Authors: Ilyes Jenhani, Salem Benferhat, Zied Elouedi

Abstract:

This paper investigates the issue of building decision trees from data with imprecise class values where imprecision is encoded in the form of possibility distributions. The Information Affinity similarity measure is introduced into the well-known gain ratio criterion in order to assess the homogeneity of a set of possibility distributions representing instances-s classes belonging to a given training partition. For the experimental study, we proposed an information affinity based performance criterion which we have used in order to show the performance of the approach on well-known benchmarks.

Keywords: Data mining from uncertain data, Decision Trees, Possibility Theory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1498
7695 A Local Decisional Algorithm Using Agent- Based Management in Constrained Energy Environment

Authors: C. Adam, G. Henri, T. Levent, J-B Mauro, A-L Mayet

Abstract:

Energy Efficiency Management is the heart of a worldwide problem. The capability of a multi-agent system as a technology to manage the micro-grid operation has already been proved. This paper deals with the implementation of a decisional pattern applied to a multi-agent system which provides intelligence to a distributed local energy network considered at local consumer level. Development of multi-agent application involves agent specifications, analysis, design, and realization. Furthermore, it can be implemented by following several decisional patterns. The purpose of present article is to suggest a new approach for a decisional pattern involving a multi-agent system to control a distributed local energy network in a decentralized competitive system. The proposed solution is the result of a dichotomous approach based on environment observation. It uses an iterative process to solve automatic learning problems and converges monotonically very fast to system attracting operation point.

Keywords: Energy Efficiency Management, Distributed Smart- Grid, Multi-Agent System, Decisional Decentralized Competitive System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1398
7694 Instant Location Detection of Objects Moving at High-Speedin C-OTDR Monitoring Systems

Authors: Andrey V. Timofeev

Abstract:

The practical efficient approach is suggested to estimate the high-speed objects instant bounds in C-OTDR monitoring systems. In case of super-dynamic objects (trains, cars) is difficult to obtain the adequate estimate of the instantaneous object localization because of estimation lag. In other words, reliable estimation coordinates of monitored object requires taking some time for data observation collection by means of C-OTDR system, and only if the required sample volume will be collected the final decision could be issued. But it is contrary to requirements of many real applications. For example, in rail traffic management systems we need to get data of the dynamic objects localization in real time. The way to solve this problem is to use the set of statistical independent parameters of C-OTDR signals for obtaining the most reliable solution in real time. The parameters of this type we can call as «signaling parameters» (SP). There are several the SP’s which carry information about dynamic objects instant localization for each of COTDR channels. The problem is that some of these parameters are very sensitive to dynamics of seismoacoustic emission sources, but are non-stable. On the other hand, in case the SP is very stable it becomes insensitive as rule. This report contains describing of the method for SP’s co-processing which is designed to get the most effective dynamic objects localization estimates in the C-OTDR monitoring system framework.

Keywords: C-OTDR-system, co-processing of signaling parameters, high-speed objects localization, multichannel monitoring systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1878
7693 Introduction to Techno-Sectoral Innovation System Modeling and Functions Formulating

Authors: S. M. Azad, H. Ghodsipour, F. Roshannafas

Abstract:

In recent years ‘technology management and policymaking’ is one of the most important problems in management science. In this field, different generations of innovation and technology management are presented which the earliest one is Innovation System (IS) approach. In a general classification, innovation systems are divided in to 4 approaches: technical, sectoral, regional, and national. There are many researches in relation to each of these approaches in different academic fields. Every approach has some benefits. If two or more approaches hybrid, their benefits would be combined. In addition, according to the sectoral structure of the governance model in Iran, in many sectors, such as information technology, the combination of three other approaches with sectoral approach is essential. Hence, in this paper, combining two IS approaches (technical and sectoral) and using system dynamics, a generic model is presented for a sample of software industry. As a complimentary point, this article is introducing a new hybrid approach called Techno-Sectoral Innovation System. This TSIS model is accomplished by Changing concepts of the ‘functions’-which came from Technological IS literature- and using them into sectoral system as measurable indicators.

Keywords: Innovation system, technology, techno-sectoral system, functional indicators, system dynamics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
7692 Expert Based System Design for Integrated Waste Management

Authors: A. Buruzs, M. F. Hatwágner, A. Torma, L. T. Kóczy

Abstract:

Recently, an increasing number of researchers have been focusing on working out realistic solutions to sustainability problems. As sustainability issues gain higher importance for organisations, the management of such decisions becomes critical. Knowledge representation is a fundamental issue of complex knowledge based systems. Many types of sustainability problems would benefit from models based on experts’ knowledge. Cognitive maps have been used for analyzing and aiding decision making. A cognitive map can be made of almost any system or problem. A fuzzy cognitive map (FCM) can successfully represent knowledge and human experience, introducing concepts to represent the essential elements and the cause and effect relationships among the concepts to model the behaviour of any system. Integrated waste management systems (IWMS) are complex systems that can be decomposed to non-related and related subsystems and elements, where many factors have to be taken into consideration that may be complementary, contradictory, and competitive; these factors influence each other and determine the overall decision process of the system. The goal of the present paper is to construct an efficient IWMS which considers various factors. The authors’ intention is to propose an expert based system design approach for implementing expert decision support in the area of IWMSs and introduces an appropriate methodology for the development and analysis of group FCM. A framework for such a methodology consisting of the development and application phases is presented.

Keywords: Factors, fuzzy cognitive map, group decision, integrated waste management system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1941
7691 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform

Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu

Abstract:

Given a fixed fund, purchasing fewer hosts of higher capability or inversely more of lower capability is a must-be-made trade-off in practices for building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing is with SQL queries of aggregate, join, and space-time condition selections executed upon massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of host clusters potential for the intended typical big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem HDFS+Hive+Spark. A proper metric was raised to measure the performance of Hadoop clusters in HBDP, which was tested and compared with its predicted counterpart, on executing three kinds of typical SQL query tasks. Tests were conducted with respect to factors of CPU benchmark, memory size, virtual host division, and the number of element physical host in cluster. The research has been applied to practical cluster procurement for housing big data computing.

Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance empirical formula, typical SQL query tasks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 820
7690 Analysis of Palm Perspiration Effect with SVM for Diabetes in People

Authors: Hamdi Melih Saraoğlu, Muhlis Yıldırım, Abdurrahman Özbeyaz, Feyzullah Temurtas

Abstract:

In this research, the diabetes conditions of people (healthy, prediabete and diabete) were tried to be identified with noninvasive palm perspiration measurements. Data clusters gathered from 200 subjects were used (1.Individual Attributes Cluster and 2. Palm Perspiration Attributes Cluster). To decrase the dimensions of these data clusters, Principal Component Analysis Method was used. Data clusters, prepared in that way, were classified with Support Vector Machines. Classifications with highest success were 82% for Glucose parameters and 84% for HbA1c parametres.

Keywords: Palm perspiration, Diabetes, Support Vector Machine, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1917
7689 Software Engineering Inspired Cost Estimation for Process Modelling

Authors: Felix Baumann, Aleksandar Milutinovic, Dieter Roller

Abstract:

Up to this point business process management projects in general and business process modelling projects in particular could not rely on a practical and scientifically validated method to estimate cost and effort. Especially the model development phase is not covered by a cost estimation method or model. Further phases of business process modelling starting with implementation are covered by initial solutions which are discussed in the literature. This article proposes a method of filling this gap by deriving a cost estimation method from available methods in similar domains namely software development or software engineering. Software development is regarded as closely similar to process modelling as we show. After the proposition of this method different ideas for further analysis and validation of the method are proposed. We derive this method from COCOMO II and Function Point which are established methods of effort estimation in the domain of software development. For this we lay out similarities of the software development process and the process of process modelling which is a phase of the Business Process Management life-cycle.

Keywords: Cost Estimation, Effort Estimation, Process Modelling, Business Process Management, COCOMO.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2266
7688 Elemental Graph Data Model: A Semantic and Topological Representation of Building Elements

Authors: Yasmeen A. S. Essawy, Khaled Nassar

Abstract:

With the rapid increase of complexity in the building industry, professionals in the A/E/C industry were forced to adopt Building Information Modeling (BIM) in order to enhance the communication between the different project stakeholders throughout the project life cycle and create a semantic object-oriented building model that can support geometric-topological analysis of building elements during design and construction. This paper presents a model that extracts topological relationships and geometrical properties of building elements from an existing fully designed BIM, and maps this information into a directed acyclic Elemental Graph Data Model (EGDM). The model incorporates BIM-based search algorithms for automatic deduction of geometrical data and topological relationships for each building element type. Using graph search algorithms, such as Depth First Search (DFS) and topological sortings, all possible construction sequences can be generated and compared against production and construction rules to generate an optimized construction sequence and its associated schedule. The model is implemented in a C# platform.

Keywords: Building information modeling, elemental graph data model, geometric and topological data models, and graph theory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1176
7687 Crossing Borders: In Research and Business Communication

Authors: E. Podhovnik

Abstract:

Cultures play a role in business communication and in research. At the example of language in international business, this paper addresses the issue of how the research cultures of management research and linguistics as well as cultures as such can be linked. After looking at existing research on language in international business, this paper approaches communication in international business from a linguistic angle and attempts to explain communication issues in businesses based on linguistic research. Thus the paper makes a step into cross-disciplinary research combining management research with linguistics.

Keywords: Language in international business, sociolinguistics, ethnopragmatics, cultural scripts.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2347
7686 A Materialized View Approach to Support Aggregation Operations over Long Periods in Sensor Networks

Authors: Minsoo Lee, Julee Choi, Sookyung Song

Abstract:

The increasing interest on processing data created by sensor networks has evolved into approaches to implement sensor networks as databases. The aggregation operator, which calculates a value from a large group of data such as computing averages or sums, etc. is an essential function that needs to be provided when implementing such sensor network databases. This work proposes to add the DURING clause into TinySQL to calculate values during a specific long period and suggests a way to implement the aggregation service in sensor networks by applying materialized view and incremental view maintenance techniques that is used in data warehouses. In sensor networks, data values are passed from child nodes to parent nodes and an aggregation value is computed at the root node. As such root nodes need to be memory efficient and low powered, it becomes a problem to recompute aggregate values from all past and current data. Therefore, applying incremental view maintenance techniques can reduce the memory consumption and support fast computation of aggregate values.

Keywords: Aggregation, Incremental View Maintenance, Materialized view, Sensor Network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1523
7685 Real Time Data Communication with FlightGear Using Simulink over a UDP Protocol

Authors: Adil Loya, Ali Haider, Arslan A. Ghaffor, Abubaker Siddique

Abstract:

Simulation and modelling of Unmanned Aerial Vehicle (UAV) has gained wide popularity in front of aerospace community. The demand of designing and modelling optimized control system for UAV has increased ten folds since last decade, as next generation warfare is dependent on unmanned technologies. Therefore, this research focuses on the simulation of nonlinear UAV dynamics on Simulink and its integration with Flightgear. There has been lots of research on implementation of optimizing control using Simulink, however, there are fewer known techniques to simulate these dynamics over Flightgear and a tedious technique of acquiring data has been tackled in this research horizon. Sending data to Flightgear is easy but receiving it from Simulink is not that straight forward, i.e. we can only receive control data on the output. However, in this research we have managed to get the data out from the Flightgear by implementation of level 2 s-function block within Simulink. Moreover, the results captured from Flightgear over a Universal Datagram Protocol (UDP) communication are then compared with the attitude signal that were sent previously. This provide useful information regarding the difference in outputs attained from Simulink to Flightgear. It was found that values received on Simulink were in high agreement with that of the Flightgear output. And complete study has been conducted in a discrete way.

Keywords: aerospace, flight control, FlightGear, communication, Simulink

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1104
7684 Implementing Knowledge Transfer Solution through Web-based Help Desk System

Authors: Mazeyanti M. Ariffin, Noreen Izza Arshad, Ainol Rahmah Shaarani, Syed Uzair Shah

Abstract:

Knowledge management is a process taking any steps that needed to get the most out of available knowledge resources. KM involved several steps; capturing the knowledge discovering new knowledge, sharing the knowledge and applied the knowledge in the decision making process. In applying the knowledge, it is not necessary for the individual that use the knowledge to comprehend it as long as the available knowledge is used in guiding the decision making and actions. When an expert is called and he provides stepby- step procedure on how to solve the problems to the caller, the expert is transferring the knowledge or giving direction to the caller. And the caller is 'applying' the knowledge by following the instructions given by the expert. An appropriate mechanism is needed to ensure effective knowledge transfer which in this case is by telephone or email. The problem with email and telephone is that the knowledge is not fully circulated and disseminated to all users. In this paper, with related experience of local university Help Desk, it is proposed the usage of Information Technology (IT)to effectively support the knowledge transfer in the organization. The issues covered include the existing knowledge, the related works, the methodology used in defining the knowledge management requirements as well the overview of the prototype.

Keywords: Knowledge Management, Knowledge Transfer, Help Desk, Web-based system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1757
7683 Comparing Data Analysis, Communication and Information Technologies Expertise Levels in Undergraduate Psychology Students

Authors: Ana Cázares

Abstract:

Aims for this study: first, to compare the expertise level in data analysis, communication and information technologies in undergraduate psychology students. Second, to verify the factor structure of E-ETICA (Escala de Experticia en Tecnologias de la Informacion, la Comunicacion y el Análisis or Data Analysis, Communication and Information'Expertise Scale) which had shown an excellent internal consistency (α= 0.92) as well as a simple factor structure. Three factors, Complex, Basic Information and Communications Technologies and E-Searching and Download Abilities, explains 63% of variance. In the present study, 260 students (119 juniors and 141 seniors) were asked to respond to ETICA (16 items Likert scale of five points 1: null domain to 5: total domain). The results show that both junior and senior students report having very similar expertise level; however, E-ETICA presents a different factor structure for juniors and four factors explained also 63% of variance: Information E-Searching, Download and Process; Data analysis; Organization; and Communication technologies.

Keywords: Data analysis, Information, Communications Technologies, Expertise'Levels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1269
7682 Design and Development of Real-Time Optimal Energy Management System for Hybrid Electric Vehicles

Authors: Masood Roohi, Amir Taghavipour

Abstract:

This paper describes a strategy to develop an energy management system (EMS) for a charge-sustaining power-split hybrid electric vehicle. This kind of hybrid electric vehicles (HEVs) benefit from the advantages of both parallel and series architecture. However, it gets relatively more complicated to manage power flow between the battery and the engine optimally. The applied strategy in this paper is based on nonlinear model predictive control approach. First of all, an appropriate control-oriented model which was accurate enough and simple was derived. Towards utilization of this controller in real-time, the problem was solved off-line for a vast area of reference signals and initial conditions and stored the computed manipulated variables inside look-up tables. Look-up tables take a little amount of memory. Also, the computational load dramatically decreased, because to find required manipulated variables the controller just needed a simple interpolation between tables.

Keywords: Hybrid electric vehicles, energy management system, nonlinear model predictive control, real-time.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1353
7681 Omni: Data Science Platform for Evaluate Performance of a LoRaWAN Network

Authors: Emanuele A. Solagna, Ricardo S, Tozetto, Roberto dos S. Rabello

Abstract:

Nowadays, physical processes are becoming digitized by the evolution of communication, sensing and storage technologies which promote the development of smart cities. The evolution of this technology has generated multiple challenges related to the generation of big data and the active participation of electronic devices in society. Thus, devices can send information that is captured and processed over large areas, but there is no guarantee that all the obtained data amount will be effectively stored and correctly persisted. Because, depending on the technology which is used, there are parameters that has huge influence on the full delivery of information. This article aims to characterize the project, currently under development, of a platform that based on data science will perform a performance and effectiveness evaluation of an industrial network that implements LoRaWAN technology considering its main parameters configuration relating these parameters to the information loss.

Keywords: Internet of Things, LoRa, LoRaWAN, smart cities.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 689
7680 Integration of Image and Patient Data, Software and International Coding Systems for Use in a Mammography Research Project

Authors: V. Balanica, W. I. D. Rae, M. Caramihai, S. Acho, C. P. Herbst

Abstract:

Mammographic images and data analysis to facilitate modelling or computer aided diagnostic (CAD) software development should best be done using a common database that can handle various mammographic image file formats and relate these to other patient information. This would optimize the use of the data as both primary reporting and enhanced information extraction of research data could be performed from the single dataset. One desired improvement is the integration of DICOM file header information into the database, as an efficient and reliable source of supplementary patient information intrinsically available in the images. The purpose of this paper was to design a suitable database to link and integrate different types of image files and gather common information that can be further used for research purposes. An interface was developed for accessing, adding, updating, modifying and extracting data from the common database, enhancing the future possible application of the data in CAD processing. Technically, future developments envisaged include the creation of an advanced search function to selects image files based on descriptor combinations. Results can be further used for specific CAD processing and other research. Design of a user friendly configuration utility for importing of the required fields from the DICOM files must be done.

Keywords: Database Integration, Mammogram Classification, Tumour Classification, Computer Aided Diagnosis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1930
7679 Efficient Pre-Processing of Single-Cell Assay for Transposase Accessible Chromatin with High-Throughput Sequencing Data

Authors: Fan Gao, Lior Pachter

Abstract:

The primary tool currently used to pre-process 10X chromium single-cell ATAC-seq data is Cell Ranger, which can take very long to run on standard datasets. To facilitate rapid pre-processing that enables reproducible workflows, we present a suite of tools called scATAK for pre-processing single-cell ATAC-seq data that is 15 to 18 times faster than Cell Ranger on mouse and human samples. Our tool can also calculate chromatin interaction potential matrices and generate open chromatin signal and interaction traces for cell groups. We use scATAK tool to explore the chromatin regulatory landscape of a healthy adult human brain and unveil cell-type specific features, and show that it provides a convenient and computational efficient approach for pre-processing single-cell ATAC-seq data.

Keywords: single-cell, ATAC-seq, bioinformatics, open chromatin landscape, chromatin interactome

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1113
7678 Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Current real-estate value estimation, difficult for laymen, usually is performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates influences of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Further to that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for derivation of machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyses and predicts accordingly. The analysis process discerns the crucial factors effecting price increases by calculation of weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.

Keywords: Big data, building-value analysis, machine learning, price prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1149
7677 Production Structures of Energy Based on Water Force, Its Infrastructure Protection, and Possible Causes of Failure

Authors: Gabriela-Andreea Despescu, Mădălina-Elena Mavrodin, Gheorghe Lăzăroiu, Florin Adrian Grădinaru

Abstract:

The purpose of this paper is to contribute to the enhancement of a hydroelectric plant protection by coordinating protection measures / existing security and introducing new measures under a risk management process. In addition, plan identifies key critical elements of a hydroelectric plant, from its level vulnerabilities and threats it is subjected to in order to achieve the necessary protection measures to reduce the level of risk.

Keywords: Critical infrastructure, risk analysis, critical infrastructure protection, vulnerability, risk management, turbine, Impact analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1552
7676 Qualitative Data Analysis for Health Care Services

Authors: Taner Ersoz, Filiz Ersoz

Abstract:

This study was designed enable application of multivariate technique in the interpretation of categorical data for measuring health care services satisfaction in Turkey. The data was collected from a total of 17726 respondents. The establishment of the sample group and collection of the data were carried out by a joint team from The Ministry of Health and Turkish Statistical Institute (Turk Stat) of Turkey. The multiple correspondence analysis (MCA) was used on the data of 2882 respondents who answered the questionnaire in full. The multiple correspondence analysis indicated that, in the evaluation of health services females, public employees, younger and more highly educated individuals were more concerned and complainant than males, private sector employees, older and less educated individuals. Overall 53 % of the respondents were pleased with the improvements in health care services in the past three years. This study demonstrates the public consciousness in health services and health care satisfaction in Turkey. It was found that most the respondents were pleased with the improvements in health care services over the past three years. Awareness of health service quality increases with education levels. Older individuals and males would appear to have lower expectancies in health services.

Keywords: Multiple correspondence analysis, optimal scaling, multivariate categorical data, health care services, health satisfaction survey, statistical visualizing, Turkey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 861
7675 Exploiting Self-Adaptive Replication Management on Decentralized Tuple Space

Authors: Xing Jiankuan, Qin Zheng, Zhang Jinxue

Abstract:

Decentralized Tuple Space (DTS) implements tuple space model among a series of decentralized hosts and provides the logical global shared tuple repository. Replication has been introduced to promote performance problem incurred by remote tuple access. In this paper, we propose a replication approach of DTS allowing replication policies self-adapting. The accesses from users or other nodes are monitored and collected to contribute the decision making. The replication policy may be changed if the better performance is expected. The experiments show that this approach suitably adjusts the replication policies, which brings negligible overhead.

Keywords: Decentralization, Replication Management, SelfAdaption, Tuple Space.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1202
7674 Comparison of Different Types of Sources of Traffic Using SFQ Scheduling Discipline

Authors: Alejandro Gomez Suarez, H. Srikanth Kamath

Abstract:

In this paper, SFQ (Start Time Fair Queuing) algorithm is analyzed when this is applied in computer networks to know what kind of behavior the traffic in the net has when different data sources are managed by the scheduler. Using the NS2 software the computer networks were simulated to be able to get the graphs showing the performance of the scheduler. Different traffic sources were introduced in the scripts, trying to establish the real scenario. Finally the results were that depending on the data source, the traffic can be affected in different levels, when Constant Bite Rate is applied, the scheduler ensures a constant level of data sent and received, but the truth is that in the real life it is impossible to ensure a level that resists the changes in work load.

Keywords: Cbq, Cbr, Nam, Ns2.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2123
7673 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — In the Case of Critical Dataset Size —

Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno

Abstract:

STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induct if-then rules from the decision table which is considered as a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducting true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity of rule induction from datasets with contaminated attribute values created by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical size of dataset derived from the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data

Keywords: Rule induction, decision table, missing data, noise.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1443
7672 The Comparison of Anchor and Star Schema from a Query Performance Perspective

Authors: Radek Němec

Abstract:

Today's business environment requires that companies have access to highly relevant information in a matter of seconds. Modern Business Intelligence tools rely on data structured mostly in traditional dimensional database schemas, typically represented by star schemas. Dimensional modeling is already recognized as a leading industry standard in the field of data warehousing although several drawbacks and pitfalls were reported. This paper focuses on the analysis of another data warehouse modeling technique - the anchor modeling, and its characteristics in context with the standardized dimensional modeling technique from a query performance perspective. The results of the analysis show information about performance of queries executed on database schemas structured according to principles of each database modeling technique.

Keywords: Data warehousing, anchor modeling, star schema, anchor schema, query performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3302
7671 Data about Loggerhead Sea Turtle (Caretta caretta) and Green Turtle (Chelonia mydas) in Vlora Bay, Albania

Authors: Enerit Saçdanaku, Idriz Haxhiu

Abstract:

This study was conducted in the area of Vlora Bay, Albania. Data about Sea Turtles Caretta caretta and Chelonia mydas, belonging to two periods of time (1984 – 1991; 2008 – 2014) are given. All data gathered were analyzed using recent methodologies. For all turtles captured (as by catch), the Curve Carapace Length (CCL) and Curved Carapace Width (CCW) were measured. These data were statistically analyzed, where the mean was 67.11 cm for CCL and 57.57 cm for CCW of all individuals studied (n=13). All untagged individuals of marine turtles were tagged using metallic tags (Stockbrand’s titanium tag) with an Albanian address. Sex was determined and resulted that 45.4% of individuals were females, 27.3% males and 27.3% juveniles. All turtles were studied for the presence of the epibionts. The area of Vlora Bay is used from marine turtles (Caretta caretta) as a migratory corridor to pass from Mediterranean to the northern part of the Adriatic Sea.

Keywords: Caretta caretta, Chelonia mydas, CCL, CCW, tagging, Vlora Bay.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2110