Search results for: Data Estimation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8076

Search results for: Data Estimation

7476 Improved Estimation of Evolutionary Spectrum based on Short Time Fourier Transforms and Modified Magnitude Group Delay by Signal Decomposition

Authors: H K Lakshminarayana, J S Bhat, H M Mahesh

Abstract:

A new estimator for evolutionary spectrum (ES) based on short time Fourier transform (STFT) and modified group delay function (MGDF) by signal decomposition (SD) is proposed. The STFT due to its built-in averaging, suppresses the cross terms and the MGDF preserves the frequency resolution of the rectangular window with the reduction in the Gibbs ripple. The present work overcomes the magnitude distortion observed in multi-component non-stationary signals with STFT and MGDF estimation of ES using SD. The SD is achieved either through discrete cosine transform based harmonic wavelet transform (DCTHWT) or perfect reconstruction filter banks (PRFB). The MGDF also improves the signal to noise ratio by removing associated noise. The performance of the present method is illustrated for cross chirp and frequency shift keying (FSK) signals, which indicates that its performance is better than STFT-MGDF (STFT-GD) alone. Further its noise immunity is better than STFT. The SD based methods, however cannot bring out the frequency transition path from band to band clearly, as there will be gap in the contour plot at the transition. The PRFB based STFT-SD shows good performance than DCTHWT decomposition method for STFT-GD.

Keywords: Evolutionary Spectrum, Modified Group Delay, Discrete Cosine Transform, Harmonic Wavelet Transform, Perfect Reconstruction Filter Banks, Short Time Fourier Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1592
7475 A Methodology for Data Migration between Different Database Management Systems

Authors: Bogdan Walek, Cyril Klimes

Abstract:

In present days the area of data migration is very topical. Current tools for data migration in the area of relational database have several disadvantages that are presented in this paper. We propose a methodology for data migration of the database tables and their data between various types of relational database systems (RDBMS). The proposed methodology contains an expert system. The expert system contains a knowledge base that is composed of IFTHEN rules and based on the input data suggests appropriate data types of columns of database tables. The proposed tool, which contains an expert system, also includes the possibility of optimizing the data types in the target RDBMS database tables based on processed data of the source RDBMS database tables. The proposed expert system is shown on data migration of selected database of the source RDBMS to the target RDBMS.

Keywords: Expert system, fuzzy, data migration, database, relational database, data type, relational database management system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3458
7474 Affine Radial Basis Function Neural Networks for the Robust Control of Hyperbolic Distributed Parameter Systems

Authors: Eleni Aggelogiannaki, Haralambos Sarimveis

Abstract:

In this work, a radial basis function (RBF) neural network is developed for the identification of hyperbolic distributed parameter systems (DPSs). This empirical model is based only on process input-output data and used for the estimation of the controlled variables at specific locations, without the need of online solution of partial differential equations (PDEs). The nonlinear model that is obtained is suitably transformed to a nonlinear state space formulation that also takes into account the model mismatch. A stable robust control law is implemented for the attenuation of external disturbances. The proposed identification and control methodology is applied on a long duct, a common component of thermal systems, for a flow based control of temperature distribution. The closed loop performance is significantly improved in comparison to existing control methodologies.

Keywords: Hyperbolic Distributed Parameter Systems, Radial Basis Function Neural Networks, H∞ control, Thermal systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1405
7473 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: Big data, open data, productivity, transparency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1605
7472 Infrastructure Change Monitoring Using Multitemporal Multispectral Satellite Images

Authors: U. Datta

Abstract:

The main objective of this study is to find a suitable approach to monitor the land infrastructure growth over a period of time using multispectral satellite images. Bi-temporal change detection method is unable to indicate the continuous change occurring over a long period of time. To achieve this objective, the approach used here estimates a statistical model from series of multispectral image data over a long period of time, assuming there is no considerable change during that time period and then compare it with the multispectral image data obtained at a later time. The change is estimated pixel-wise. Statistical composite hypothesis technique is used for estimating pixel based change detection in a defined region. The generalized likelihood ratio test (GLRT) is used to detect the changed pixel from probabilistic estimated model of the corresponding pixel. The changed pixel is detected assuming that the images have been co-registered prior to estimation. To minimize error due to co-registration, 8-neighborhood pixels around the pixel under test are also considered. The multispectral images from Sentinel-2 and Landsat-8 from 2015 to 2018 are used for this purpose. There are different challenges in this method. First and foremost challenge is to get quite a large number of datasets for multivariate distribution modelling. A large number of images are always discarded due to cloud coverage. Due to imperfect modelling there will be high probability of false alarm. Overall conclusion that can be drawn from this work is that the probabilistic method described in this paper has given some promising results, which need to be pursued further.

Keywords: Co-registration, GLRT, infrastructure growth, multispectral, multitemporal, pixel-based change detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 694
7471 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: Big data, correlation analysis, data recommendation system, urban data network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1087
7470 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment – A Practical Example

Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh

Abstract:

With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.

Keywords: Data integration, disease-related malnutrition, expert systems, mobile health.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2185
7469 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes

Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin

Abstract:

Missing data is a persistent problem in almost all areas of empirical research. The missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on NASA-s two public dataset. KC4 and KC1. The actual data sets of 125 cases and 2107 cases respectively, without any missing values were considered. The data set is used to create Missing at Random (MAR) data Listwise Deletion(LD), Mean Substitution(MS), Interpolation, Regression with an error term and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.

Keywords: Missing data, Imputation, Missing Data Techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1651
7468 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists

Authors: George E. Tsekouras, Evi Sampanikou

Abstract:

We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory since the determination of any classes of similarities in such kind of data will provide to art-historians very fruitful information of Comics Art-s evolution. To establish this, we use a categorical data set and we study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared each other, while interpretations of the clustering results are also given.

Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1620
7467 Bandwidth Estimation Algorithms for the Dynamic Adaptation of Voice Codec

Authors: Davide Pierattoni, Ivan Macor, Pier Luca Montessoro

Abstract:

In the recent years multimedia traffic and in particular VoIP services are growing dramatically. We present a new algorithm to control the resource utilization and to optimize the voice codec selection during SIP call setup on behalf of the traffic condition estimated on the network path. The most suitable methodologies and the tools that perform realtime evaluation of the available bandwidth on a network path have been integrated with our proposed algorithm: this selects the best codec for a VoIP call in function of the instantaneous available bandwidth on the path. The algorithm does not require any explicit feedback from the network, and this makes it easily deployable over the Internet. We have also performed intensive tests on real network scenarios with a software prototype, verifying the algorithm efficiency with different network topologies and traffic patterns between two SIP PBXs. The promising results obtained during the experimental validation of the algorithm are now the basis for the extension towards a larger set of multimedia services and the integration of our methodology with existing PBX appliances.

Keywords: Integrated voice-data communication, computernetwork performance, resource optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1678
7466 Identification of Aircraft Gas Turbine Engines Temperature Condition

Authors: Pashayev A., Askerov D., C. Ardil, Sadiqov R., Abdullayev P.

Abstract:

Groundlessness of application probability-statistic methods are especially shown at an early stage of the aviation GTE technical condition diagnosing, when the volume of the information has property of the fuzzy, limitations, uncertainty and efficiency of application of new technology Soft computing at these diagnosing stages by using the fuzzy logic and neural networks methods. It is made training with high accuracy of multiple linear and nonlinear models (the regression equations) received on the statistical fuzzy data basis. At the information sufficiency it is offered to use recurrent algorithm of aviation GTE technical condition identification on measurements of input and output parameters of the multiple linear and nonlinear generalized models at presence of noise measured (the new recursive least squares method (LSM)). As application of the given technique the estimation of the new operating aviation engine D30KU-154 technical condition at height H=10600 m was made.

Keywords: Identification of a technical condition, aviation gasturbine engine, fuzzy logic and neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1641
7465 IoT Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Seani Rananga

Abstract:

This paper focused on cost effective storage architecture using fog and cloud data storage gateway, and presented the design of the framework for the data privacy model and data analytics framework on a real-time analysis when using machine learning method. The paper began with the system analysis, system architecture and its component design, as well as the overall system operations. Several results obtained from this study on data privacy models show that when two or more data privacy models are integrated via a fog storage gateway, we often have more secure data. Our main focus in the study is to design a framework for the data privacy model, data storage, and real-time analytics. This paper also shows the major system components and their framework specification. And lastly, the overall research system architecture was shown, including its structure, and its interrelationships.

Keywords: IoT, fog storage, cloud storage, data analysis, data privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 200
7464 Identification of Aircraft Gas Turbine Engine's Temperature Condition

Authors: Pashayev A., Askerov D., C. Ardil, Sadiqov R., Abdullayev P.

Abstract:

Groundlessness of application probability-statistic methods are especially shown at an early stage of the aviation GTE technical condition diagnosing, when the volume of the information has property of the fuzzy, limitations, uncertainty and efficiency of application of new technology Soft computing at these diagnosing stages by using the fuzzy logic and neural networks methods. It is made training with high accuracy of multiple linear and nonlinear models (the regression equations) received on the statistical fuzzy data basis. At the information sufficiency it is offered to use recurrent algorithm of aviation GTE technical condition identification on measurements of input and output parameters of the multiple linear and nonlinear generalized models at presence of noise measured (the new recursive least squares method (LSM)). As application of the given technique the estimation of the new operating aviation engine D30KU-154 technical condition at height H=10600 m was made.

Keywords: Identification of a technical condition, aviation gasturbine engine, fuzzy logic and neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1658
7463 Impact of Vehicle Travel Characteristics on Level of Service: A Comparative Analysis of Rural and Urban Freeways

Authors: Anwaar Ahmed, Muhammad Bilal Khurshid, Samuel Labi

Abstract:

The effect of trucks on the level of service is determined by considering passenger car equivalents (PCE) of trucks. The current version of Highway Capacity Manual (HCM) uses a single PCE value for all tucks combined. However, the composition of truck traffic varies from location to location; therefore, a single PCE value for all trucks may not correctly represent the impact of truck traffic at specific locations. Consequently, present study developed separate PCE values for single-unit and combination trucks to replace the single value provided in the HCM on different freeways. Site specific PCE values, were developed using concept of spatial lagging headways (that is the distance between rear bumpers of two vehicles in a traffic stream) measured from field traffic data. The study used data from four locations on a single urban freeway and three different rural freeways in Indiana. Three-stage-leastsquares (3SLS) regression techniques were used to generate models that predicted lagging headways for passenger cars, single unit trucks (SUT), and combination trucks (CT). The estimated PCE values for single-unit and combination truck for basic urban freeways (level terrain) were: 1.35 and 1.60, respectively. For rural freeways the estimated PCE values for single-unit and combination truck were: 1.30 and 1.45, respectively. As expected, traffic variables such as vehicle flow rates and speed have significant impacts on vehicle headways. Study results revealed that the use of separate PCE values for different truck classes can have significant influence on the LOS estimation.

Keywords: Level of Service, Capacity Analysis, Lagging Headway.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2076
7462 Estimation and Removal of Chlorophenolic Compounds from Paper Mill Waste Water by Electrochemical Treatment

Authors: R. Sharma, S. Kumar, C. Sharma

Abstract:

A number of toxic chlorophenolic compounds are formed during pulp bleaching. The nature and concentration of these chlorophenolic compounds largely depends upon the amount and nature of bleaching chemicals used. These compounds are highly recalcitrant and difficult to remove but are partially removed by the biochemical treatment processes adopted by the paper industry. Identification and estimation of these chlorophenolic compounds has been carried out in the primary and secondary clarified effluents from the paper mill by GCMS. Twenty-six chorophenolic compounds have been identified and estimated in paper mill waste waters. Electrochemical treatment is an efficient method for oxidation of pollutants and has successfully been used to treat textile and oil waste water. Electrochemical treatment using less expensive anode material, stainless steel electrodes has been tried to study their removal. The electrochemical assembly comprised a DC power supply, a magnetic stirrer and stainless steel (316 L) electrode. The optimization of operating conditions has been carried out and treatment has been performed under optimized treatment conditions. Results indicate that 68.7% and 83.8% of cholorphenolic compounds are removed during 2 h of electrochemical treatment from primary and secondary clarified effluent respectively. Further, there is a reduction of 65.1, 60 and 92.6% of COD, AOX and color, respectively for primary clarified and 83.8%, 75.9% and 96.8% of COD, AOX and color, respectively for secondary clarified effluent. EC treatment has also been found to increase significantly the biodegradability index of wastewater because of conversion of non- biodegradable fraction into biodegradable fraction. Thus, electrochemical treatment is an efficient method for the degradation of cholorophenolic compounds, removal of color, AOX and other recalcitrant organic matter present in paper mill waste water.

Keywords: Chlorophenolics, effluent, electrochemical treatment, wastewater.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1878
7461 The Effects of Detector Spacing on Travel Time Prediction on Freeways

Authors: Piyali Chaudhuri, Peter T. Martin, Aleksandar Z. Stevanovic, Chongkai Zhu

Abstract:

Loop detectors report traffic characteristics in real time. They are at the core of traffic control process. Intuitively, one would expect that as density of detection increases, so would the quality of estimates derived from detector data. However, as detector deployment increases, the associated operating and maintenance cost increases. Thus, traffic agencies often need to decide where to add new detectors and which detectors should continue receiving maintenance, given their resource constraints. This paper evaluates the effect of detector spacing on freeway travel time estimation. A freeway section (Interstate-15) in Salt Lake City metropolitan region is examined. The research reveals that travel time accuracy does not necessarily deteriorate with increased detector spacing. Rather, the actual location of detectors has far greater influence on the quality of travel time estimates. The study presents an innovative computational approach that delivers optimal detector locations through a process that relies on Genetic Algorithm formulation.

Keywords: Detector, Freeway, Genetic algorithm, Travel timeestimate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1646
7460 Drainage Prediction for Dam using Fuzzy Support Vector Regression

Authors: S. Wiriyarattanakun, A. Ruengsiriwatanakun, S. Noimanee

Abstract:

The drainage Estimating is an important factor in dam management. In this paper, we use fuzzy support vector regression (FSVR) to predict the drainage of the Sirikrit Dam at Uttaradit province, Thailand. The results show that the FSVR is a suitable method in drainage estimating.

Keywords: Drainage Estimation, Prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1249
7459 Knowledge Representation Based On Interval Type-2 CFCM Clustering

Authors: Myung-Won Lee, Keun-Chang Kwak

Abstract:

This paper is concerned with knowledge representation and extraction of fuzzy if-then rules using Interval Type-2 Context-based Fuzzy C-Means clustering (IT2-CFCM) with the aid of fuzzy granulation. This proposed clustering algorithm is based on information granulation in the form of IT2 based Fuzzy C-Means (IT2-FCM) clustering and estimates the cluster centers by preserving the homogeneity between the clustered patterns from the IT2 contexts produced in the output space. Furthermore, we can obtain the automatic knowledge representation in the design of Radial Basis Function Networks (RBFN), Linguistic Model (LM), and Adaptive Neuro-Fuzzy Networks (ANFN) from the numerical input-output data pairs. We shall focus on a design of ANFN in this paper. The experimental results on an estimation problem of energy performance reveal that the proposed method showed a good knowledge representation and performance in comparison with the previous works.

Keywords: IT2-FCM, IT2-CFCM, context-based fuzzy clustering, adaptive neuro-fuzzy network, knowledge representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2601
7458 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain

Authors: Amal M. Alrayes

Abstract:

Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance. Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.

Keywords: Data quality, performance, system quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2101
7457 Integration of Multi-Source Data to Monitor Coral Biodiversity

Authors: K. Jitkue, W. Srisang, C. Yaiprasert, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

This study aims at using multi-source data to monitor coral biodiversity and coral bleaching. We used coral reef at Racha Islands, Phuket as a study area. There were three sources of data: coral diversity, sensor based data and satellite data.

Keywords: Coral reefs, Remote sensing, Sea surfacetemperatue, Satellite imagery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1535
7456 Leveraging xAPI in a Corporate e-Learning Environment to Facilitate the Tracking, Modelling, and Predictive Analysis of Learner Behaviour

Authors: Libor Zachoval, Daire O Broin, Oisin Cawley

Abstract:

E-learning platforms, such as Blackboard have two major shortcomings: limited data capture as a result of the limitations of SCORM (Shareable Content Object Reference Model), and lack of incorporation of Artificial Intelligence (AI) and machine learning algorithms which could lead to better course adaptations. With the recent development of Experience Application Programming Interface (xAPI), a large amount of additional types of data can be captured and that opens a window of possibilities from which online education can benefit. In a corporate setting, where companies invest billions on the learning and development of their employees, some learner behaviours can be troublesome for they can hinder the knowledge development of a learner. Behaviours that hinder the knowledge development also raise ambiguity about learner’s knowledge mastery, specifically those related to gaming the system. Furthermore, a company receives little benefit from their investment if employees are passing courses without possessing the required knowledge and potential compliance risks may arise. Using xAPI and rules derived from a state-of-the-art review, we identified three learner behaviours, primarily related to guessing, in a corporate compliance course. The identified behaviours are: trying each option for a question, specifically for multiple-choice questions; selecting a single option for all the questions on the test; and continuously repeating tests upon failing as opposed to going over the learning material. These behaviours were detected on learners who repeated the test at least 4 times before passing the course. These findings suggest that gauging the mastery of a learner from multiple-choice questions test scores alone is a naive approach. Thus, next steps will consider the incorporation of additional data points, knowledge estimation models to model knowledge mastery of a learner more accurately, and analysis of the data for correlations between knowledge development and identified learner behaviours. Additional work could explore how learner behaviours could be utilised to make changes to a course. For example, course content may require modifications (certain sections of learning material may be shown to not be helpful to many learners to master the learning outcomes aimed at) or course design (such as the type and duration of feedback).

Keywords: Compliance Course, Corporate Training, Learner Behaviours, xAPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 549
7455 Decision Support System Based on Data Warehouse

Authors: Yang Bao, LuJing Zhang

Abstract:

Typical Intelligent Decision Support System is 4-based, its design composes of Data Warehouse, Online Analytical Processing, Data Mining and Decision Supporting based on models, which is called Decision Support System Based on Data Warehouse (DSSBDW). This way takes ETL,OLAP and DM as its implementing means, and integrates traditional model-driving DSS and data-driving DSS into a whole. For this kind of problem, this paper analyzes the DSSBDW architecture and DW model, and discusses the following key issues: ETL designing and Realization; metadata managing technology using XML; SQL implementing, optimizing performance, data mapping in OLAP; lastly, it illustrates the designing principle and method of DW in DSSBDW.

Keywords: Decision Support System, Data Warehouse, Data Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3842
7454 Estimation of the Drought Index Based on the Climatic Projections of Precipitation of the Uruguay River Basin

Authors: José Leandro Melgar Néris, Claudinéia Brazil, Luciane Teresa Salvi, Isabel Cristina Damin

Abstract:

The impact the climate change is not recent, the main variable in the hydrological cycle is the sequence and shortage of a drought, which has a significant impact on the socioeconomic, agricultural and environmental spheres. This study aims to characterize and quantify, based on precipitation climatic projections, the rainy and dry events in the region of the Uruguay River Basin, through the Standardized Precipitation Index (SPI). The database is the image that is part of the Intercomparison of Model Models, Phase 5 (CMIP5), which provides condition prediction models, organized according to the Representative Routes of Concentration (CPR). Compared to the normal set of climates in the Uruguay River Watershed through precipitation projections, seasonal precipitation increases for all proposed scenarios, with a low climate trend. From the data of this research, the idea is that this article can be used to support research and the responsible bodies can use it as a subsidy for mitigation measures in other hydrographic basins.

Keywords: Drought index, climatic projections, precipitation of the Uruguay River Basin, Standardized Precipitation Index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 564
7453 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: Data Stream, Classification, Concept Shift, History.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1266
7452 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: Text mining, topic extraction, independent, incremental, independent component analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1033
7451 Economic Loss due to Ganoderma Disease in Oil Palm

Authors: K. Assis, K. P. Chong, A. S. Idris, C. M. Ho

Abstract:

Oil palm or Elaeis guineensis is considered as the golden crop in Malaysia. But oil palm industry in this country is now facing with the most devastating disease called as Ganoderma Basal Stem Rot disease. The objective of this paper is to analyze the economic loss due to this disease. There were three commercial oil palm sites selected for collecting the required data for economic analysis. Yield parameter used to measure the loss was the total weight of fresh fruit bunch in six months. The predictors include disease severity, change in disease severity, number of infected neighbor palms, age of palm, planting generation, topography, and first order interaction variables. The estimation model of yield loss was identified by using backward elimination based regression method. Diagnostic checking was conducted on the residual of the best yield loss model. The value of mean absolute percentage error (MAPE) was used to measure the forecast performance of the model. The best yield loss model was then used to estimate the economic loss by using the current monthly price of fresh fruit bunch at mill gate.

Keywords: Ganoderma, oil palm, regression model, yield loss, economic loss.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3200
7450 A Framework for Data Mining Based Multi-Agent: An Application to Spatial Data

Authors: H. Baazaoui Zghal, S. Faiz, H. Ben Ghezala

Abstract:

Data mining is an extraordinarily demanding field referring to extraction of implicit knowledge and relationships, which are not explicitly stored in databases. A wide variety of methods of data mining have been introduced (classification, characterization, generalization...). Each one of these methods includes more than algorithm. A system of data mining implies different user categories,, which mean that the user-s behavior must be a component of the system. The problem at this level is to know which algorithm of which method to employ for an exploratory end, which one for a decisional end, and how can they collaborate and communicate. Agent paradigm presents a new way of conception and realizing of data mining system. The purpose is to combine different algorithms of data mining to prepare elements for decision-makers, benefiting from the possibilities offered by the multi-agent systems. In this paper the agent framework for data mining is introduced, and its overall architecture and functionality are presented. The validation is made on spatial data. Principal results will be presented.

Keywords: Databases, data mining, multi-agent, spatial datamart.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2030
7449 Latent Topic Based Medical Data Classification

Authors: Jian-hua Yeh, Shi-yi Kuo

Abstract:

This paper discusses the classification process for medical data. In this paper, we use the data from ACM KDDCup 2008 to demonstrate our classification process based on latent topic discovery. In this data set, the target set and outliers are quite different in their nature: target set is only 0.6% size in total, while the outliers consist of 99.4% of the data set. We use this data set as an example to show how we dealt with this extremely biased data set with latent topic discovery and noise reduction techniques. Our experiment faces two major challenge: (1) extremely distributed outliers, and (2) positive samples are far smaller than negative ones. We try to propose a suitable process flow to deal with these issues and get a best AUC result of 0.98.

Keywords: classification, latent topics, outlier adjustment, feature scaling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628
7448 Likelihood Estimation for Stochastic Epidemics with Heterogeneous Mixing Populations

Authors: Yilun Shang

Abstract:

We consider a heterogeneously mixing SIR stochastic epidemic process in populations described by a general graph. Likelihood theory is developed to facilitate statistic inference for the parameters of the model under complete observation. We show that these estimators are asymptotically Gaussian unbiased estimates by using a martingale central limit theorem.

Keywords: statistic inference, maximum likelihood, epidemicmodel, heterogeneous mixing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1393
7447 Subpixel Detection of Circular Objects Using Geometric Property

Authors: Wen-Yen Wu, Wen-Bin Yu

Abstract:

In this paper, we propose a method for detecting circular shapes with subpixel accuracy. First, the geometric properties of circles have been used to find the diameters as well as the circumference pixels. The center and radius are then estimated by the circumference pixels. Both synthetic and real images have been tested by the proposed method. The experimental results show that the new method is efficient.

Keywords: Subpixel, least squares estimation, circle detection, Hough transformation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2116