Search results for: Data Integration

7565 Data Transformation Services (DTS): Creating Data Mart by Consolidating Multi-Source Enterprise Operational Data

Authors: J. D. D. Daniel, K. N. Goh, S. M. Yusop

Abstract:

Trends in business intelligence, e-commerce and remote access make it necessary and practical to store data in different ways on multiple systems with different operating systems. As business evolve and grow, they require efficient computerized solution to perform data update and to access data from diverse enterprise business applications. The objective of this paper is to demonstrate the capability of DTS [1] as a database solution for automatic data transfer and update in solving business problem. This DTS package is developed for the sales of variety of plants and eventually expanded into commercial supply and landscaping business. Dimension data modeling is used in DTS package to extract, transform and load data from heterogeneous database systems such as MySQL, Microsoft Access and Oracle that consolidates into a Data Mart residing in SQL Server. Hence, the data transfer from various databases is scheduled to run automatically every quarter of the year to review the efficient sales analysis. Therefore, DTS is absolutely an attractive solution for automatic data transfer and update which meeting today-s business needs.

Keywords: Data Transformation Services (DTS), ObjectLinking and Embedding Database (OLEDB), Data Mart, OnlineAnalytical Processing (OLAP), Online Transactional Processing(OLTP).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1981

7564 Extraction of Data from Web Pages: A Vision Based Approach

Authors: P. S. Hiremath, Siddu P. Algur

Abstract:

With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to identify the relevant pieces of information, since web pages are often cluttered with irrelevant content like advertisements, navigation-panels, copyright notices etc., surrounding the main content of the web page. Hence, tools for the mining of data regions, data records and data items need to be developed in order to provide value-added services. Currently available automatic techniques to mine data regions from web pages are still unsatisfactory because of their poor performance and tag-dependence. In this paper a novel method to extract data items from the web pages automatically is proposed. It comprises of two steps: (1) Identification and Extraction of the data regions based on visual clues information. (2) Identification of data records and extraction of data items from a data region. For step1, a novel and more effective method is proposed based on visual clues, which finds the data regions formed by all types of tags using visual clues. For step2 a more effective method namely, Extraction of Data Items from web Pages (EDIP), is adopted to mine data items. The EDIP technique is a list-based approach in which the list is a linear data structure. The proposed technique is able to mine the non-contiguous data records and can correctly identify data regions, irrespective of the type of tag in which it is bound. Our experimental results show that the proposed technique performs better than the existing techniques.

Keywords: Web data records, web data regions, web mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1860

7563 Visual-Graphical Methods for Exploring Longitudinal Data

Authors: H. W. Ker

Abstract:

Longitudinal data typically have the characteristics of changes over time, nonlinear growth patterns, between-subjects variability, and the within errors exhibiting heteroscedasticity and dependence. The data exploration is more complicated than that of cross-sectional data. The purpose of this paper is to organize/integrate of various visual-graphical techniques to explore longitudinal data. From the application of the proposed methods, investigators can answer the research questions include characterizing or describing the growth patterns at both group and individual level, identifying the time points where important changes occur and unusual subjects, selecting suitable statistical models, and suggesting possible within-error variance.

Keywords: Data exploration, exploratory analysis, HLMs/LMEs, longitudinal data, visual-graphical methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2045

7562 Data-Driven Decision-Making in Digital Entrepreneurship

Authors: Abeba Nigussie Turi, Xiangming Samuel Li

Abstract:

Data-driven business models are more typical for established businesses than early-stage startups that strive to penetrate a market. This paper provided an extensive discussion on the principles of data analytics for early-stage digital entrepreneurial businesses. Here, we developed data-driven decision-making (DDDM) framework that applies to startups prone to multifaceted barriers in the form of poor data access, technical and financial constraints, to state some. The startup DDDM framework proposed in this paper is novel in its form encompassing startup data analytics enablers and metrics aligning with startups' business models ranging from customer-centric product development to servitization which is the future of modern digital entrepreneurship.

Keywords: Startup data analytics, data-driven decision-making, data acquisition, data generation, digital entrepreneurship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 743

7561 Wearable Sensing Application- Carbon Dioxide Monitoring for Emergency Personnel Using Wearable Sensors

Authors: Tanja Radu, Cormac Fay, King Tong Lau, Rhys Waite, Dermot Diamond

Abstract:

The development of wearable sensing technologies is a great challenge which is being addressed by the Proetex FP6 project (www.proetex.org). Its main aim is the development of wearable sensors to improve the safety and efficiency of emergency personnel. This will be achieved by continuous, real-time monitoring of vital signs, posture, activity, and external hazards surrounding emergency workers. We report here the development of carbon dioxide (CO2) sensing boot by incorporating commercially available CO2 sensor with a wireless platform into the boot assembly. Carefully selected commercially available sensors have been tested. Some of the key characteristics of the selected sensors are high selectivity and sensitivity, robustness and the power demand. This paper discusses some of the results of CO2 sensor tests and sensor integration with wireless data transmission

Keywords: Proetex, gas sensing, wireless, wearable sensors, carbon dioxide

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1534

7560 Classifying Bio-Chip Data using an Ant Colony System Algorithm

Authors: Minsoo Lee, Yearn Jeong Kim, Yun-mi Kim, Sujeung Cheong, Sookyung Song

Abstract:

Bio-chips are used for experiments on genes and contain various information such as genes, samples and so on. The two-dimensional bio-chips, in which one axis represent genes and the other represent samples, are widely being used these days. Instead of experimenting with real genes which cost lots of money and much time to get the results, bio-chips are being used for biological experiments. And extracting data from the bio-chips with high accuracy and finding out the patterns or useful information from such data is very important. Bio-chip analysis systems extract data from various kinds of bio-chips and mine the data in order to get useful information. One of the commonly used methods to mine the data is classification. The algorithm that is used to classify the data can be various depending on the data types or number characteristics and so on. Considering that bio-chip data is extremely large, an algorithm that imitates the ecosystem such as the ant algorithm is suitable to use as an algorithm for classification. This paper focuses on finding the classification rules from the bio-chip data using the Ant Colony algorithm which imitates the ecosystem. The developed system takes in consideration the accuracy of the discovered rules when it applies it to the bio-chip data in order to predict the classes.

Keywords: Ant Colony System, DNA chip data, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1422

7559 An Efficient Framework to Build Up Malware Dataset

Authors: Madihah Mohd Saudi, Zul Hilmi Abdullah

Abstract:

This research paper presents a framework on how to build up malware dataset.Many researchers took longer time to clean the dataset from any noise or to transform the dataset into a format that can be used straight away for testing. Therefore, this research is proposing a framework to help researchers to speed up the malware dataset cleaningprocesses which later can be used for testing. It is believed, an efficient malware dataset cleaning processes, can improved the quality of the data, thus help to improve the accuracy and the efficiency of the subsequent analysis. Apart from that, an in-depth understanding of the malware taxonomy is also important prior and during the dataset cleaning processes. A new Trojan classification has been proposed to complement this framework.This experiment has been conducted in a controlled lab environment and using the dataset from VxHeavens dataset. This framework is built based on the integration of static and dynamic analyses, incident response method and knowledge database discovery (KDD) processes.This framework can be used as the basis guideline for malware researchers in building malware dataset.

Keywords: Dataset, knowledge database discovery (KDD), malware, static and dynamic analyses.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3427

7558 A Framework for Designing Complex Product- Service Systems with a Multi-Domain Matrix

Authors: Yoonjung An, Yongtae Park

Abstract:

Offering a Product-Service System (PSS) is a well-accepted strategy that companies may adopt to provide a set of systemic solutions to customers. PSSs were initially provided in a simple form but now take diversified and complex forms involving multiple services, products and technologies. With the growing interest in the PSS, frameworks for the PSS development have been introduced by many researchers. However, most of the existing frameworks fail to examine various relations existing in a complex PSS. Since designing a complex PSS involves full integration of multiple products and services, it is essential to identify not only product-service relations but also product-product/ service-service relations. It is also equally important to specify how they are related for better understanding of the system. Moreover, as customers tend to view their purchase from a more holistic perspective, a PSS should be developed based on the whole system’s requirements, rather than focusing only on the product requirements or service requirements. Thus, we propose a framework to develop a complex PSS that is coordinated fully with the requirements of both worlds. Specifically, our approach adopts a multi-domain matrix (MDM). A MDM identifies not only inter-domain relations but also intra-domain relations so that it helps to design a PSS that includes highly desired and closely related core functions/ features. Also, various dependency types and rating schemes proposed in our approach would help the integration process.

Keywords: Inter-domain relations, intra-domain relations, multi-domain matrix, product-service system design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2384

7557 Solving Process Planning, Weighted Earliest Due Date Scheduling and Weighted Due Date Assignment Using Simulated Annealing and Evolutionary Strategies

Authors: Halil Ibrahim Demir, Abdullah Hulusi Kokcam, Fuat Simsir, Özer Uygun

Abstract:

Traditionally, three important manufacturing functions which are process planning, scheduling and due-date assignment are performed sequentially and separately. Although there are numerous works on the integration of process planning and scheduling and plenty of works focusing on scheduling with due date assignment, there are only a few works on integrated process planning, scheduling and due-date assignment. Although due-dates are determined without taking into account of weights of the customers in the literature, here weighted due-date assignment is employed to get better performance. Jobs are scheduled according to weighted earliest due date dispatching rule and due dates are determined according to some popular due date assignment methods by taking into account of the weights of each job. Simulated Annealing, Evolutionary Strategies, Random Search, hybrid of Random Search and Simulated Annealing, and hybrid of Random Search and Evolutionary Strategies, are applied as solution techniques. Three important manufacturing functions are integrated step-by-step and higher integration levels are found better. Search meta-heuristics are found to be very useful while improving performance measure.

Keywords: Evolutionary strategies, hybrid searches, process planning, simulated annealing, weighted due-date assignment, weighted scheduling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1117

7556 Trust and Reliability for Public Sector Data

Authors: Klaus Stranacher, Vesna Krnjic, Thomas Zefferer

Abstract:

The public sector holds large amounts of data of various areas such as social affairs, economy, or tourism. Various initiatives such as Open Government Data or the EU Directive on public sector information aim to make these data available for public and private service providers. Requirements for the provision of public sector data are defined by legal and organizational frameworks. Surprisingly, the defined requirements hardly cover security aspects such as integrity or authenticity. In this paper we discuss the importance of these missing requirements and present a concept to assure the integrity and authenticity of provided data based on electronic signatures. We show that our concept is perfectly suitable for the provisioning of unaltered data. We also show that our concept can also be extended to data that needs to be anonymized before provisioning by incorporating redactable signatures. Our proposed concept enhances trust and reliability of provided public sector data.

Keywords: Trusted Public Sector Data, Integrity, Authenticity, Reliability, Redactable Signatures.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1723

7555 Forecasting Direct Normal Irradiation at Djibouti Using Artificial Neural Network

Authors: Ahmed Kayad Abdourazak, Abderafi Souad, Zejli Driss, Idriss Abdoulkader Ibrahim

Abstract:

In this paper Artificial Neural Network (ANN) is used to predict the solar irradiation in Djibouti for the first Time that is useful to the integration of Concentrating Solar Power (CSP) and sites selections for new or future solar plants as part of solar energy development. An ANN algorithm was developed to establish a forward/reverse correspondence between the latitude, longitude, altitude and monthly solar irradiation. For this purpose the German Aerospace Centre (DLR) data of eight Djibouti sites were used as training and testing in a standard three layers network with the back propagation algorithm of Lavenber-Marquardt. Results have shown a very good agreement for the solar irradiation prediction in Djibouti and proves that the proposed approach can be well used as an efficient tool for prediction of solar irradiation by providing so helpful information concerning sites selection, design and planning of solar plants.

Keywords: Artificial neural network, solar irradiation, concentrated solar power, Lavenberg-Marquardt.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1022

7554 Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance

Authors: Ekachai Phaisangittisagul, Rapeepol Chongprachawat

Abstract:

Obtaining labeled data in supervised learning is often difficult and expensive, and thus the trained learning algorithm tends to be overfitting due to small number of training data. As a result, some researchers have focused on using unlabeled data which may not necessary to follow the same generative distribution as the labeled data to construct a high-level feature for improving performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data for classification performance. Specifically, we will apply difference unlabeled data which have different degrees of relation to the labeled data for handwritten digit classification task based on MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although the unlabeled data that is completely from different generative distribution to the labeled data provides the lowest classification performance, we still achieve high classification performance. This leads to expanding the applicability of the supervised learning algorithms using unsupervised learning.

Keywords: Autoencoder, high-level feature, MNIST dataset, selftaught learning, supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1778

7553 Towards Development of Solution for Business Process-Oriented Data Analysis

Authors: M. Klimavicius

Abstract:

This paper proposes a modeling methodology for the development of data analysis solution. The Author introduce the approach to address data warehousing issues at the at enterprise level. The methodology covers the process of the requirements eliciting and analysis stage as well as initial design of data warehouse. The paper reviews extended business process model, which satisfy the needs of data warehouse development. The Author considers that the use of business process models is necessary, as it reflects both enterprise information systems and business functions, which are important for data analysis. The Described approach divides development into three steps with different detailed elaboration of models. The Described approach gives possibility to gather requirements and display them to business users in easy manner.

Keywords: Data warehouse, data analysis, business processmanagement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1354

7552 Reducing Power Consumption in Cloud Platforms using an Effective Mechanism

Authors: Shuen-Tai Wang, Chin-Hung Li, Ying-Chuan Chen

Abstract:

In recent years there has been renewal of interest in the relation between Green IT and Cloud Computing. The growing use of computers in cloud platform has caused marked energy consumption, putting negative pressure on electricity cost of cloud data center. This paper proposes an effective mechanism to reduce energy utilization in cloud computing environments. We present initial work on the integration of resource and power management that aims at reducing power consumption. Our mechanism relies on recalling virtualization services dynamically according to user-s virtualization request and temporarily shutting down the physical machines after finish in order to conserve energy. Given the estimated energy consumption, this proposed effort has the potential to positively impact power consumption. The results from the experiment concluded that energy indeed can be saved by powering off the idling physical machines in cloud platforms.

Keywords: Green IT, Cloud Computing, virtualization, power consumption.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2114

7551 The Investigation of the Possible Connections between Acculturation and the Acquisition of a Second Language on Libyan Teenage Students

Authors: Hamza M. A. Muftah

Abstract:

The study investigates the possible connections between acculturation and the acquisition of a second language on Libyan teenage students in Australia. Specifically, the study examined how various socio-psychological variables influenced English oral proficiency (oral communicative competence and native-like pronunciation) of the participants. In addition, it looked at whether or not SLA affects acculturation towards the target language group. This is achieved by analysing data obtained from semi-structured interviews and oral proficiency interviews. The present study found a definite link between the students’ acculturation process and their oral communicative competence but not native-like pronunciation. The results also provided evidence that SLL process has an impact on integration into the host society as well as the acquisition of a second language culture. Yet, it did not draw a clear conclusion with respect to how such a process affects these aspects.

Keywords: Acculturation, Native-like pronunciation, Oral communicative competence, Second language acquisition, Second language learners.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2640

7550 Towards a Secure Storage in Cloud Computing

Authors: Mohamed Elkholy, Ahmed Elfatatry

Abstract:

Cloud computing has emerged as a flexible computing paradigm that reshaped the Information Technology map. However, cloud computing brought about a number of security challenges as a result of the physical distribution of computational resources and the limited control that users have over the physical storage. This situation raises many security challenges for data integrity and confidentiality as well as authentication and access control. This work proposes a security mechanism for data integrity that allows a data owner to be aware of any modification that takes place to his data. The data integrity mechanism is integrated with an extended Kerberos authentication that ensures authorized access control. The proposed mechanism protects data confidentiality even if data are stored on an untrusted storage. The proposed mechanism has been evaluated against different types of attacks and proved its efficiency to protect cloud data storage from different malicious attacks.

Keywords: Access control, data integrity, data confidentiality, Kerberos authentication, cloud security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1723

7549 Thailand National Biodiversity Database System with webMathematica and Google Earth

Authors: W. Katsarapong, W. Srisang, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

National Biodiversity Database System (NBIDS) has been developed for collecting Thai biodiversity data. The goal of this project is to provide advanced tools for querying, analyzing, modeling, and visualizing patterns of species distribution for researchers and scientists. NBIDS data record two types of datasets: biodiversity data and environmental data. Biodiversity data are specie presence data and species status. The attributes of biodiversity data can be further classified into two groups: universal and projectspecific attributes. Universal attributes are attributes that are common to all of the records, e.g. X/Y coordinates, year, and collector name. Project-specific attributes are attributes that are unique to one or a few projects, e.g., flowering stage. Environmental data include atmospheric data, hydrology data, soil data, and land cover data collecting by using GLOBE protocols. We have developed webbased tools for data entry. Google Earth KML and ArcGIS were used as tools for map visualization. webMathematica was used for simple data visualization and also for advanced data analysis and visualization, e.g., spatial interpolation, and statistical analysis. NBIDS will be used by park rangers at Khao Nan National Park, and researchers.

Keywords: GLOBE protocol, Biodiversity, Database System, ArcGIS, Google Earth and webMathematica.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1929

7548 Evaluation of Clustering Based on Preprocessing in Gene Expression Data

Authors: Seo Young Kim, Toshimitsu Hamasaki

Abstract:

Microarrays have become the effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we express and compare the performance of several clustering methods based on data preprocessing including strategies of normalization or noise clearness. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. Consequently, clustering methods which are common used in microarray data analysis are affected by normalization and degree of noise and clearness for datasets.

Keywords: Gene expression, clustering, data preprocessing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1697

7547 Economic Factors Affecting Rice Export of Thailand

Authors: Somphoom Sawaengkun

Abstract:

The purpose of this study was primarily assessing how important economic factors namely: The Thai export price of white rice, the exchange rate, and the world rice consumption affect the overall Thai white rice export, using historical data during the period 1989-2013 from the Thai Rice Exporters Association, and Food and Agricultural Organization of the United Nations. The co-integration method, regression analysis, and error correction model were applied to investigate the econometric model. The findings indicated that in the long-run, the world rice consumption, the exchange rate, and the Thai export price of white rice were the important factors affecting the export quantity of Thai white rice respectively, as indicated by their significant coefficients. Meanwhile, the rice export price was an important factor affecting the export quantity of Thai white rice in the short-run. This information is useful in the business, export opportunities, price competitiveness, and policymaker in Thailand.

Keywords: Economic Factors, Rice Export, White Rice.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3433

7546 A Hybrid Recommendation System Based On Association Rules

Authors: Ahmed Mohammed K. Alsalama

Abstract:

Recommendation systems are widely used in e-commerce applications. The engine of a current recommendation system recommends items to a particular user based on user preferences and previous high ratings. Various recommendation schemes such as collaborative filtering and content-based approaches are used to build a recommendation system. Most of current recommendation systems were developed to fit a certain domain such as books, articles, and movies. We propose1 a hybrid framework recommendation system to be applied on two dimensional spaces (User × Item) with a large number of Users and a small number of Items. Moreover, our proposed framework makes use of both favorite and non-favorite items of a particular user. The proposed framework is built upon the integration of association rules mining and the content-based approach. The results of experiments show that our proposed framework can provide accurate recommendations to users.

Keywords: Data Mining, Association Rules, Recommendation Systems, Hybrid Systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3941

7545 RV-YOLOX: Object Detection on Inland Waterways Based on Optimized YOLOX through Fusion of Vision and 3+1D Millimeter Wave Radar

Authors: Zixian Zhang, Shanliang Yao, Zile Huang, Zhaodong Wu, Xiaohui Zhu, Yong Yue, Jieming Ma

Abstract:

Unmanned Surface Vehicles (USVs) hold significant value for their capacity to undertake hazardous and labor-intensive operations over aquatic environments. Object detection tasks are significant in these applications. Nonetheless, the efficacy of USVs in object detection is impeded by several intrinsic challenges, including the intricate dispersal of obstacles, reflections emanating from coastal structures, and the presence of fog over water surfaces, among others. To address these problems, this paper provides a fusion method for USVs to effectively detect objects in the inland surface environment, utilizing vision sensors and 3+1D Millimeter-wave radar. The MMW radar is a complementary tool to vision sensors, offering reliable environmental data. This approach involves the conversion of the radar’s 3D point cloud into a 2D radar pseudo-image, thereby standardizing the format for radar and vision data by leveraging a point transformer. Furthermore, this paper proposes the development of a multi-source object detection network, named RV-YOLOX, which leverages radar-vision integration specifically tailored for inland waterway environments. The performance is evaluated on our self-recording waterways dataset. Compared with the YOLOX network, our fusion network significantly improves detection accuracy, especially for objects with bad light conditions.

Keywords: Inland waterways, object detection, YOLO, sensor fusion, self-attention, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 107

7544 Product-Based Industrial Information Systems (Application to the Steel Industry)

Authors: Daniel F. Garcia, Diego Gonzalez

Abstract:

This paper shows a simple and effective approach to the design and implementation of Industrial Information Systems (IIS) oriented to control the characteristics of each individual product manufactured in a production line and also their manufacturing conditions. The particular products considered in this work are large steel strips that are coiled just after their manufacturing. However, the approach is directly applicable to coiled strips in other industries, like paper, textile, aluminum, etc. These IIS provide very detailed information of each manufactured product, which complement the general information managed by the ERP system of the production line. In spite of the high importance of this type of IIS to guarantee and improve the quality of the products manufactured in many industries, there are very few works about them in the technical literature. For this reason, this paper represents an important contribution to the development of this type of IIS, providing guidelines for their design, implementation and exploitation.

Keywords: Data storage, industrial information systems, measurement systems integration, signal acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1392

7543 Addressing Data Security in the Cloud

Authors: Marinela Mircea

Abstract:

The development of information and communication technology, the increased use of the internet, as well as the effects of the recession within the last years, have lead to the increased use of cloud computing based solutions, also called on-demand solutions. These solutions offer a large number of benefits to organizations as well as challenges and risks, mainly determined by data visualization in different geographic locations on the internet. As far as the specific risks of cloud environment are concerned, data security is still considered a peak barrier in adopting cloud computing. The present study offers an approach upon ensuring the security of cloud data, oriented towards the whole data life cycle. The final part of the study focuses on the assessment of data security in the cloud, this representing the bases in determining the potential losses and the premise for subsequent improvements and continuous learning.

Keywords: cloud computing, data life cycle, data security, security assessment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2104

7542 A Network Traffic Prediction Algorithm Based On Data Mining Technique

Authors: D. Prangchumpol

Abstract:

This paper is a description approach to predict incoming and outgoing data rate in network system by using association rule discover, which is one of the data mining techniques. Information of incoming and outgoing data in each times and network bandwidth are network performance parameters, which needed to solve in the traffic problem. Since congestion and data loss are important network problems. The result of this technique can predicted future network traffic. In addition, this research is useful for network routing selection and network performance improvement.

Keywords: Traffic prediction, association rule, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3613

7541 Fuzzy Processing of Uncertain Data

Authors: Petr Morávek, Miloš Šeda

Abstract:

In practice, we often come across situations where it is necessary to make decisions based on incomplete or uncertain data. In control systems it may be due to the unknown exact mathematical model, or its excessive complexity (e.g. nonlinearity) when it is necessary to simplify it, respectively, to solve it using a rule base. In the case of databases, searching data we compare a similarity measure with of the requirements of the selection with stored data, where both the select query and the data itself may contain vague terms, for example in the form of linguistic qualifiers. In this paper, we focus on the processing of uncertain data in databases and demonstrate it on the example multi-criteria decision making in the selection of variants, specified by higher number of technical parameters.

Keywords: fuzzy logic, linguistic variable, multicriteria decision

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1372

7540 Automated Stereophotogrammetry Data Cleansing

Authors: Stuart Henry, Philip Morrow, John Winder, Bryan Scotney

Abstract:

The stereophotogrammetry modality is gaining more widespread use in the clinical setting. Registration and visualization of this data, in conjunction with conventional 3D volumetric image modalities, provides virtual human data with textured soft tissue and internal anatomical and structural information. In this investigation computed tomography (CT) and stereophotogrammetry data is acquired from 4 anatomical phantoms and registered using the trimmed iterative closest point (TrICP) algorithm. This paper fully addresses the issue of imaging artifacts around the stereophotogrammetry surface edge using the registered CT data as a reference. Several iterative algorithms are implemented to automatically identify and remove stereophotogrammetry surface edge outliers, improving the overall visualization of the combined stereophotogrammetry and CT data. This paper shows that outliers at the surface edge of stereophotogrammetry data can be successfully removed automatically.

Keywords: Data cleansing, stereophotogrammetry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1794

7539 An Improved Data Mining Method Applied to the Search of Relationship between Metabolic Syndrome and Lifestyles

Authors: Yi Chao Huang, Yu Ling Liao, Chiu Shuang Lin

Abstract:

A data cutting and sorting method (DCSM) is proposed to optimize the performance of data mining. DCSM reduces the calculation time by getting rid of redundant data during the data mining process. In addition, DCSM minimizes the computational units by splitting the database and by sorting data with support counts. In the process of searching for the relationship between metabolic syndrome and lifestyles with the health examination database of an electronics manufacturing company, DCSM demonstrates higher search efficiency than the traditional Apriori algorithm in tests with different support counts.

Keywords: Data mining, Data cutting and sorting method, Apriori algorithm, Metabolic syndrome

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1548

7538 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.

Keywords: Data mining, hybrid storage system, recurrent neural network, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1690

7537 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. Therefore, NOSQL appears to handle the problem of unstructured data. Association rules mining is one of the popular techniques of data mining to extract hidden relationship from transactional databases. The algorithm for finding association dependencies is well-solved with Map Reduce. The goal of our work is to reduce the time of generating of frequent itemsets by using Map Reduce and NOSQL database oriented document. A comparative study is given to evaluate the performances of our algorithm with the classical algorithm Apriori.

Keywords: Apriori, Association rules mining, Big Data, data mining, Hadoop, Map Reduce, MongoDB, NoSQL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 622

7536 Nonlinear Dynamic Analysis of Base-Isolated Structures Using a Partitioned Solution Approach and an Exponential Model

Authors: Nicolò Vaiana, Filip C. Filippou, Giorgio Serino

Abstract:

The solution of the nonlinear dynamic equilibrium equations of base-isolated structures adopting a conventional monolithic solution approach, i.e. an implicit single-step time integration method employed with an iteration procedure, and the use of existing nonlinear analytical models, such as differential equation models, to simulate the dynamic behavior of seismic isolators can require a significant computational effort. In order to reduce numerical computations, a partitioned solution method and a one dimensional nonlinear analytical model are presented in this paper. A partitioned solution approach can be easily applied to base-isolated structures in which the base isolation system is much more flexible than the superstructure. Thus, in this work, the explicit conditionally stable central difference method is used to evaluate the base isolation system nonlinear response and the implicit unconditionally stable Newmark’s constant average acceleration method is adopted to predict the superstructure linear response with the benefit in avoiding iterations in each time step of a nonlinear dynamic analysis. The proposed mathematical model is able to simulate the dynamic behavior of seismic isolators without requiring the solution of a nonlinear differential equation, as in the case of widely used differential equation model. The proposed mixed explicit-implicit time integration method and nonlinear exponential model are adopted to analyze a three dimensional seismically isolated structure with a lead rubber bearing system subjected to earthquake excitation. The numerical results show the good accuracy and the significant computational efficiency of the proposed solution approach and analytical model compared to the conventional solution method and mathematical model adopted in this work. Furthermore, the low stiffness value of the base isolation system with lead rubber bearings allows to have a critical time step considerably larger than the imposed ground acceleration time step, thus avoiding stability problems in the proposed mixed method.

Keywords: Base-isolated structures, earthquake engineering, mixed time integration, nonlinear exponential model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1521