Search results for: data selection
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8145

Search results for: data selection

7605 Efficient Lossless Compression of Weather Radar Data

Authors: Wei-hua Ai, Wei Yan, Xiang Li

Abstract:

Data compression is used operationally to reduce bandwidth and storage requirements. An efficient method for achieving lossless weather radar data compression is presented. The characteristics of the data are taken into account and the optical linear prediction is used for the PPI images in the weather radar data in the proposed method. The next PPI image is identical to the current one and a dramatic reduction in source entropy is achieved by using the prediction algorithm. Some lossless compression methods are used to compress the predicted data. Experimental results show that for the weather radar data, the method proposed in this paper outperforms the other methods.

Keywords: Lossless compression, weather radar data, optical linear prediction, PPI image

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2251
7604 Conceptualizing the Knowledge to Manage and Utilize Data Assets in the Context of Digitization: Case Studies of Multinational Industrial Enterprises

Authors: Martin Böhmer, Agatha Dabrowski, Boris Otto

Abstract:

The trend of digitization significantly changes the role of data for enterprises. Data turn from an enabler to an intangible organizational asset that requires management and qualifies as a tradeable good. The idea of a networked economy has gained momentum in the data domain as collaborative approaches for data management emerge. Traditional organizational knowledge consequently needs to be extended by comprehensive knowledge about data. The knowledge about data is vital for organizations to ensure that data quality requirements are met and data can be effectively utilized and sovereignly governed. As this specific knowledge has been paid little attention to so far by academics, the aim of the research presented in this paper is to conceptualize it by proposing a “data knowledge model”. Relevant model entities have been identified based on a design science research (DSR) approach that iteratively integrates insights of various industry case studies and literature research.

Keywords: Data management, digitization, Industry 4.0, knowledge engineering, metamodel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1455
7603 Feature Analysis of Predictive Maintenance Models

Authors: Zhaoan Wang

Abstract:

Research in predictive maintenance modeling has improved in the recent years to predict failures and needed maintenance with high accuracy, saving cost and improving manufacturing efficiency. However, classic prediction models provide little valuable insight towards the most important features contributing to the failure. By analyzing and quantifying feature importance in predictive maintenance models, cost saving can be optimized based on business goals. First, multiple classifiers are evaluated with cross-validation to predict the multi-class of failures. Second, predictive performance with features provided by different feature selection algorithms are further analyzed. Third, features selected by different algorithms are ranked and combined based on their predictive power. Finally, linear explainer SHAP (SHapley Additive exPlanations) is applied to interpret classifier behavior and provide further insight towards the specific roles of features in both local predictions and global model behavior. The results of the experiments suggest that certain features play dominant roles in predictive models while others have significantly less impact on the overall performance. Moreover, for multi-class prediction of machine failures, the most important features vary with type of machine failures. The results may lead to improved productivity and cost saving by prioritizing sensor deployment, data collection, and data processing of more important features over less importance features.

Keywords: Automated supply chain, intelligent manufacturing, predictive maintenance machine learning, feature engineering, model interpretation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1996
7602 A Methodology for Data Migration between Different Database Management Systems

Authors: Bogdan Walek, Cyril Klimes

Abstract:

In present days the area of data migration is very topical. Current tools for data migration in the area of relational database have several disadvantages that are presented in this paper. We propose a methodology for data migration of the database tables and their data between various types of relational database systems (RDBMS). The proposed methodology contains an expert system. The expert system contains a knowledge base that is composed of IFTHEN rules and based on the input data suggests appropriate data types of columns of database tables. The proposed tool, which contains an expert system, also includes the possibility of optimizing the data types in the target RDBMS database tables based on processed data of the source RDBMS database tables. The proposed expert system is shown on data migration of selected database of the source RDBMS to the target RDBMS.

Keywords: Expert system, fuzzy, data migration, database, relational database, data type, relational database management system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3485
7601 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: Big data, open data, productivity, transparency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1627
7600 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: Big data, correlation analysis, data recommendation system, urban data network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1100
7599 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment – A Practical Example

Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh

Abstract:

With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.

Keywords: Data integration, disease-related malnutrition, expert systems, mobile health.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2194
7598 Applying Case-Based Reasoning in Supporting Strategy Decisions

Authors: S. M. Seyedhosseini, A. Makui, M. Ghadami

Abstract:

Globalization and therefore increasing tight competition among companies, have resulted to increase the importance of making well-timed decision. Devising and employing effective strategies, that are flexible and adaptive to changing market, stand a greater chance of being effective in the long-term. In other side, a clear focus on managing the entire product lifecycle has emerged as critical areas for investment. Therefore, applying wellorganized tools to employ past experience in new case, helps to make proper and managerial decisions. Case based reasoning (CBR) is based on a means of solving a new problem by using or adapting solutions to old problems. In this paper, an adapted CBR model with k-nearest neighbor (K-NN) is employed to provide suggestions for better decision making which are adopted for a given product in the middle of life phase. The set of solutions are weighted by CBR in the principle of group decision making. Wrapper approach of genetic algorithm is employed to generate optimal feature subsets. The dataset of the department store, including various products which are collected among two years, have been used. K-fold approach is used to evaluate the classification accuracy rate. Empirical results are compared with classical case based reasoning algorithm which has no special process for feature selection, CBR-PCA algorithm based on filter approach feature selection, and Artificial Neural Network. The results indicate that the predictive performance of the model, compare with two CBR algorithms, in specific case is more effective.

Keywords: Case based reasoning, Genetic algorithm, Groupdecision making, Product management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2170
7597 The Effects of Weather Anomalies on the Quantitative and Qualitative Parameters of Maize Hybrids of Different Genetic Traits in Hungary

Authors: Zs. J. Becze, Á. Krivián, M. Sárvári

Abstract:

Hybrid selection and the application of hybrid specific production technologies are important in terms of the increase of the yield and crop safety of maize. The main explanation for this is climate change, since weather extremes are going on and seem to accelerate in Hungary too.

The biological bases, the selection of appropriate hybrids will be of greater importance in the future. The issue of the adaptability of hybrids will be considerably appreciated. Its good agronomical traits and stress bearing against climatic factors and agrotechnical elements (e.g. different types of herbicides) will be important. There have been examples of 3-4 consecutive droughty years in the past decades, e.g. 1992-1993-1994 or 2009-2011-2012, which made the results of crop production critical. Irrigation cannot be the solution for the problem since currently only the 2% of the arable land is irrigated. Temperatures exceeding the multi-year average are characteristic mainly to the July and August in Hungary, which significantly increase the soil surface evaporation, thus further enhance water shortage. In terms of the yield and crop safety of maize, the weather of these two months is crucial, since the extreme high temperature in July decreases the viability of the pollen and the pistil of maize, decreases the extent of fertilization and makes grain-filling tardy. Consequently, yield and crop safety decrease.

Keywords: Abiotic factors, drought, nutrition content, yield.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1896
7596 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes

Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin

Abstract:

Missing data is a persistent problem in almost all areas of empirical research. The missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on NASA-s two public dataset. KC4 and KC1. The actual data sets of 125 cases and 2107 cases respectively, without any missing values were considered. The data set is used to create Missing at Random (MAR) data Listwise Deletion(LD), Mean Substitution(MS), Interpolation, Regression with an error term and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.

Keywords: Missing data, Imputation, Missing Data Techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1662
7595 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists

Authors: George E. Tsekouras, Evi Sampanikou

Abstract:

We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory since the determination of any classes of similarities in such kind of data will provide to art-historians very fruitful information of Comics Art-s evolution. To establish this, we use a categorical data set and we study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared each other, while interpretations of the clustering results are also given.

Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631
7594 IoT Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Seani Rananga

Abstract:

This paper focused on cost effective storage architecture using fog and cloud data storage gateway, and presented the design of the framework for the data privacy model and data analytics framework on a real-time analysis when using machine learning method. The paper began with the system analysis, system architecture and its component design, as well as the overall system operations. Several results obtained from this study on data privacy models show that when two or more data privacy models are integrated via a fog storage gateway, we often have more secure data. Our main focus in the study is to design a framework for the data privacy model, data storage, and real-time analytics. This paper also shows the major system components and their framework specification. And lastly, the overall research system architecture was shown, including its structure, and its interrelationships.

Keywords: IoT, fog storage, cloud storage, data analysis, data privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 227
7593 Utility Assessment Model for Wireless Technology in Construction

Authors: Y. Abdelrazig, A. Ghanem

Abstract:

Construction projects are information intensive in nature and involve many activities that are related to each other. Wireless technologies can be used to improve the accuracy and timeliness of data collected from construction sites and shares it with appropriate parties. Nonetheless, the construction industry tends to be conservative and shows hesitation to adopt new technologies. A main concern for owners, contractors or any person in charge on a job site is the cost of the technology in question. Wireless technologies are not cheap. There are a lot of expenses to be taken into consideration, and a study should be completed to make sure that the importance and savings resulting from the usage of this technology is worth the expenses. This research attempts to assess the effectiveness of using the appropriate wireless technologies based on criteria such as performance, reliability, and risk. The assessment is based on a utility function model that breaks down the selection issue into alternatives attribute. Then the attributes are assigned weights and single attributes are measured. Finally, single attribute are combined to develop one single aggregate utility index for each alternative.

Keywords: Analytic Hierarchy Process, Utility Function, Wireless Technologies, construction management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1919
7592 A Novel Architecture for Wavelet based Image Fusion

Authors: Susmitha Vekkot, Pancham Shukla

Abstract:

In this paper, we focus on the fusion of images from different sources using multiresolution wavelet transforms. Based on reviews of popular image fusion techniques used in data analysis, different pixel and energy based methods are experimented. A novel architecture with a hybrid algorithm is proposed which applies pixel based maximum selection rule to low frequency approximations and filter mask based fusion to high frequency details of wavelet decomposition. The key feature of hybrid architecture is the combination of advantages of pixel and region based fusion in a single image which can help the development of sophisticated algorithms enhancing the edges and structural details. A Graphical User Interface is developed for image fusion to make the research outcomes available to the end user. To utilize GUI capabilities for medical, industrial and commercial activities without MATLAB installation, a standalone executable application is also developed using Matlab Compiler Runtime.

Keywords: Filter mask, GUI, hybrid architecture, image fusion, Matlab Compiler Runtime, wavelet transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2384
7591 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain

Authors: Amal M. Alrayes

Abstract:

Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance. Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.

Keywords: Data quality, performance, system quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2113
7590 Hydrodynamic Modeling of a Surface Water Treatment Pilot Plant

Authors: C.-M. Militaru, A. Pǎcalǎ, I. Vlaicu, K. Bodor, G.-A. Dumitrel, T. Todinca

Abstract:

A mathematical model for the hydrodynamics of a surface water treatment pilot plant was developed and validated by the determination of the residence time distribution (RTD) for the main equipments of the unit. The well known models of ideal/real mixing, ideal displacement (plug flow) and (one-dimensional axial) dispersion model were combined in order to identify the structure that gives the best fitting of the experimental data for each equipment of the pilot plant. RTD experimental results have shown that pilot plant hydrodynamics can be quite well approximated by a combination of simple mathematical models, structure which is suitable for engineering applications. Validated hydrodynamic models will be further used in the evaluation and selection of the most suitable coagulation-flocculation reagents, optimum operating conditions (injection point, reaction times, etc.), in order to improve the quality of the drinking water.

Keywords: drinking water, hydrodynamic modeling, pilot plant, residence time distribution, surface water.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1669
7589 Integration of Multi-Source Data to Monitor Coral Biodiversity

Authors: K. Jitkue, W. Srisang, C. Yaiprasert, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

This study aims at using multi-source data to monitor coral biodiversity and coral bleaching. We used coral reef at Racha Islands, Phuket as a study area. There were three sources of data: coral diversity, sensor based data and satellite data.

Keywords: Coral reefs, Remote sensing, Sea surfacetemperatue, Satellite imagery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1549
7588 Decision Support System Based on Data Warehouse

Authors: Yang Bao, LuJing Zhang

Abstract:

Typical Intelligent Decision Support System is 4-based, its design composes of Data Warehouse, Online Analytical Processing, Data Mining and Decision Supporting based on models, which is called Decision Support System Based on Data Warehouse (DSSBDW). This way takes ETL,OLAP and DM as its implementing means, and integrates traditional model-driving DSS and data-driving DSS into a whole. For this kind of problem, this paper analyzes the DSSBDW architecture and DW model, and discusses the following key issues: ETL designing and Realization; metadata managing technology using XML; SQL implementing, optimizing performance, data mapping in OLAP; lastly, it illustrates the designing principle and method of DW in DSSBDW.

Keywords: Decision Support System, Data Warehouse, Data Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3857
7587 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: Data Stream, Classification, Concept Shift, History.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1276
7586 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: Text mining, topic extraction, independent, incremental, independent component analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1051
7585 Bandwidth Estimation Algorithms for the Dynamic Adaptation of Voice Codec

Authors: Davide Pierattoni, Ivan Macor, Pier Luca Montessoro

Abstract:

In the recent years multimedia traffic and in particular VoIP services are growing dramatically. We present a new algorithm to control the resource utilization and to optimize the voice codec selection during SIP call setup on behalf of the traffic condition estimated on the network path. The most suitable methodologies and the tools that perform realtime evaluation of the available bandwidth on a network path have been integrated with our proposed algorithm: this selects the best codec for a VoIP call in function of the instantaneous available bandwidth on the path. The algorithm does not require any explicit feedback from the network, and this makes it easily deployable over the Internet. We have also performed intensive tests on real network scenarios with a software prototype, verifying the algorithm efficiency with different network topologies and traffic patterns between two SIP PBXs. The promising results obtained during the experimental validation of the algorithm are now the basis for the extension towards a larger set of multimedia services and the integration of our methodology with existing PBX appliances.

Keywords: Integrated voice-data communication, computernetwork performance, resource optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687
7584 Automatic Staging and Subtype Determination for Non-Small Cell Lung Carcinoma Using PET Image Texture Analysis

Authors: Seyhan Karaçavuş, Bülent Yılmaz, Ömer Kayaaltı, Semra İçer, Arzu Taşdemir, Oğuzhan Ayyıldız, Kübra Eset, Eser Kaya

Abstract:

In this study, our goal was to perform tumor staging and subtype determination automatically using different texture analysis approaches for a very common cancer type, i.e., non-small cell lung carcinoma (NSCLC). Especially, we introduced a texture analysis approach, called Law’s texture filter, to be used in this context for the first time. The 18F-FDG PET images of 42 patients with NSCLC were evaluated. The number of patients for each tumor stage, i.e., I-II, III or IV, was 14. The patients had ~45% adenocarcinoma (ADC) and ~55% squamous cell carcinoma (SqCCs). MATLAB technical computing language was employed in the extraction of 51 features by using first order statistics (FOS), gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), and Laws’ texture filters. The feature selection method employed was the sequential forward selection (SFS). Selected textural features were used in the automatic classification by k-nearest neighbors (k-NN) and support vector machines (SVM). In the automatic classification of tumor stage, the accuracy was approximately 59.5% with k-NN classifier (k=3) and 69% with SVM (with one versus one paradigm), using 5 features. In the automatic classification of tumor subtype, the accuracy was around 92.7% with SVM one vs. one. Texture analysis of FDG-PET images might be used, in addition to metabolic parameters as an objective tool to assess tumor histopathological characteristics and in automatic classification of tumor stage and subtype.

Keywords: Cancer stage, cancer cell type, non-small cell lung carcinoma, PET, texture analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 971
7583 Analyzing the Effect of Materials’ Selection on Energy Saving and Carbon Footprint: A Case Study Simulation of Concrete Structure Building

Authors: M. Kouhirostamkolaei, M. Kouhirostami, M. Sam, J. Woo, A. T. Asutosh, J. Li, C. Kibert

Abstract:

Construction is one of the most energy consumed activities in the urban environment that results in a significant amount of greenhouse gas emissions around the world. Thus, the impact of the construction industry on global warming is undeniable. Thus, reducing building energy consumption and mitigating carbon production can slow the rate of global warming. The purpose of this study is to determine the amount of energy consumption and carbon dioxide production during the operation phase and the impact of using new shells on energy saving and carbon footprint. Therefore, a residential building with a re-enforced concrete structure is selected in Babolsar, Iran. DesignBuilder software has been used for one year of building operation to calculate the amount of carbon dioxide production and energy consumption in the operation phase of the building. The primary results show the building use 61750 kWh of energy each year. Computer simulation analyzes the effect of changing building shells -using XPS polystyrene and new electrochromic windows- as well as changing the type of lighting on energy consumption reduction and subsequent carbon dioxide production. The results show that the amount of energy and carbon production during building operation has been reduced by approximately 70% by applying the proposed changes. The changes reduce CO2e to 11345 kg CO2/yr. The result of this study helps designers and engineers to consider material selection’s process as one of the most important stages of design for improving energy performance of buildings.

Keywords: Construction materials, green construction, energy simulation, carbon footprint, energy saving, concrete structure, DesignBuilder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 978
7582 A Framework for Data Mining Based Multi-Agent: An Application to Spatial Data

Authors: H. Baazaoui Zghal, S. Faiz, H. Ben Ghezala

Abstract:

Data mining is an extraordinarily demanding field referring to extraction of implicit knowledge and relationships, which are not explicitly stored in databases. A wide variety of methods of data mining have been introduced (classification, characterization, generalization...). Each one of these methods includes more than algorithm. A system of data mining implies different user categories,, which mean that the user-s behavior must be a component of the system. The problem at this level is to know which algorithm of which method to employ for an exploratory end, which one for a decisional end, and how can they collaborate and communicate. Agent paradigm presents a new way of conception and realizing of data mining system. The purpose is to combine different algorithms of data mining to prepare elements for decision-makers, benefiting from the possibilities offered by the multi-agent systems. In this paper the agent framework for data mining is introduced, and its overall architecture and functionality are presented. The validation is made on spatial data. Principal results will be presented.

Keywords: Databases, data mining, multi-agent, spatial datamart.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2038
7581 Latent Topic Based Medical Data Classification

Authors: Jian-hua Yeh, Shi-yi Kuo

Abstract:

This paper discusses the classification process for medical data. In this paper, we use the data from ACM KDDCup 2008 to demonstrate our classification process based on latent topic discovery. In this data set, the target set and outliers are quite different in their nature: target set is only 0.6% size in total, while the outliers consist of 99.4% of the data set. We use this data set as an example to show how we dealt with this extremely biased data set with latent topic discovery and noise reduction techniques. Our experiment faces two major challenge: (1) extremely distributed outliers, and (2) positive samples are far smaller than negative ones. We try to propose a suitable process flow to deal with these issues and get a best AUC result of 0.98.

Keywords: classification, latent topics, outlier adjustment, feature scaling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1637
7580 Data Collection in Hospital Emergencies: A Questionnaire Survey

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

Many methods are used to collect data like questionnaires, surveys, focus group interviews. Or the collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data allow conclusions to be drawn that are not supported by the data or to focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments; or use technologies allowing better "anonymity" in the responses. In this context, and to overcome the aforementioned problems, we suggest in this paper an approach to achieve the collection of relevant data, by carrying out a large-scale questionnaire-based survey. We have been able to collect good quality, consistent and practical data on hospital emergencies to improve emergency services in hospitals, especially in the case of epidemics or pandemics.

Keywords: Data collection, survey, database, data analysis, hospital emergencies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 649
7579 Data Transformation Services (DTS): Creating Data Mart by Consolidating Multi-Source Enterprise Operational Data

Authors: J. D. D. Daniel, K. N. Goh, S. M. Yusop

Abstract:

Trends in business intelligence, e-commerce and remote access make it necessary and practical to store data in different ways on multiple systems with different operating systems. As business evolve and grow, they require efficient computerized solution to perform data update and to access data from diverse enterprise business applications. The objective of this paper is to demonstrate the capability of DTS [1] as a database solution for automatic data transfer and update in solving business problem. This DTS package is developed for the sales of variety of plants and eventually expanded into commercial supply and landscaping business. Dimension data modeling is used in DTS package to extract, transform and load data from heterogeneous database systems such as MySQL, Microsoft Access and Oracle that consolidates into a Data Mart residing in SQL Server. Hence, the data transfer from various databases is scheduled to run automatically every quarter of the year to review the efficient sales analysis. Therefore, DTS is absolutely an attractive solution for automatic data transfer and update which meeting today-s business needs.

Keywords: Data Transformation Services (DTS), ObjectLinking and Embedding Database (OLEDB), Data Mart, OnlineAnalytical Processing (OLAP), Online Transactional Processing(OLTP).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2029
7578 Extraction of Data from Web Pages: A Vision Based Approach

Authors: P. S. Hiremath, Siddu P. Algur

Abstract:

With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to identify the relevant pieces of information, since web pages are often cluttered with irrelevant content like advertisements, navigation-panels, copyright notices etc., surrounding the main content of the web page. Hence, tools for the mining of data regions, data records and data items need to be developed in order to provide value-added services. Currently available automatic techniques to mine data regions from web pages are still unsatisfactory because of their poor performance and tag-dependence. In this paper a novel method to extract data items from the web pages automatically is proposed. It comprises of two steps: (1) Identification and Extraction of the data regions based on visual clues information. (2) Identification of data records and extraction of data items from a data region. For step1, a novel and more effective method is proposed based on visual clues, which finds the data regions formed by all types of tags using visual clues. For step2 a more effective method namely, Extraction of Data Items from web Pages (EDIP), is adopted to mine data items. The EDIP technique is a list-based approach in which the list is a linear data structure. The proposed technique is able to mine the non-contiguous data records and can correctly identify data regions, irrespective of the type of tag in which it is bound. Our experimental results show that the proposed technique performs better than the existing techniques.

Keywords: Web data records, web data regions, web mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1895
7577 Visual-Graphical Methods for Exploring Longitudinal Data

Authors: H. W. Ker

Abstract:

Longitudinal data typically have the characteristics of changes over time, nonlinear growth patterns, between-subjects variability, and the within errors exhibiting heteroscedasticity and dependence. The data exploration is more complicated than that of cross-sectional data. The purpose of this paper is to organize/integrate of various visual-graphical techniques to explore longitudinal data. From the application of the proposed methods, investigators can answer the research questions include characterizing or describing the growth patterns at both group and individual level, identifying the time points where important changes occur and unusual subjects, selecting suitable statistical models, and suggesting possible within-error variance.

Keywords: Data exploration, exploratory analysis, HLMs/LMEs, longitudinal data, visual-graphical methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2090
7576 A Materialized Approach to the Integration of XML Documents: the OSIX System

Authors: H. Ahmad, S. Kermanshahani, A. Simonet, M. Simonet

Abstract:

The data exchanged on the Web are of different nature from those treated by the classical database management systems; these data are called semi-structured data since they do not have a regular and static structure like data found in a relational database; their schema is dynamic and may contain missing data or types. Therefore, the needs for developing further techniques and algorithms to exploit and integrate such data, and extract relevant information for the user have been raised. In this paper we present the system OSIX (Osiris based System for Integration of XML Sources). This system has a Data Warehouse model designed for the integration of semi-structured data and more precisely for the integration of XML documents. The architecture of OSIX relies on the Osiris system, a DL-based model designed for the representation and management of databases and knowledge bases. Osiris is a viewbased data model whose indexing system supports semantic query optimization. We show that the problem of query processing on a XML source is optimized by the indexing approach proposed by Osiris.

Keywords: Data integration, semi-structured data, views, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1582