Search results for: Monte Carlo data

7345 Statically Fused Unbiased Converted Measurements Kalman Filter

Authors: Zhengkun Guo, Yanbin Li, Wenqing Wang, Bo Zou

Abstract:

Active radar and sonar systems often report Doppler measurements in addition to position measurements such as range and bearing. A tracker can perform better by making full use of the Doppler measurements. However, because the Doppler measurements are highly nonlinear with respect to the target state in Cartesian coordinates, they are not always fully exploited. This paper focuses on handling the Doppler measurements together with the position measurements in polar coordinates. The Statically Fused Converted Position and Doppler Measurements Kalman Filter (SF-CMKF) with additive debiased measurement conversion has been presented previously. However, the exact compensation for the bias of the measurement conversion is multiplicative and depends on the statistics of the cosine of the angle measurement errors. As a result, the consistency and performance of the SF-CMKF may be suboptimal when angle errors are large. In this paper, the multiplicative unbiased position and Doppler measurement conversions for two-dimensional (polar-to-Cartesian) tracking are derived, and the SF-CMKF is improved by using those conversions. Monte Carlo simulations are presented to demonstrate the statistical consistency of the multiplicative unbiased conversion and the superior performance of the modified SF-CMKF (SF-UCMKF).
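
As a rough illustration of why the bias compensation is multiplicative, the sketch below assumes zero-mean Gaussian bearing noise with standard deviation sigma_theta, for which E[cos(error)] = exp(-sigma_theta^2 / 2). Dividing the converted position by that factor removes the bias. It covers only the position conversion, not the Doppler conversion or the filter itself, and the function names and numeric values are illustrative assumptions rather than anything taken from the paper.

```python
import math
import random


def convert_naive(r_m, theta_m):
    """Plain polar-to-Cartesian conversion of a measured range and bearing."""
    return r_m * math.cos(theta_m), r_m * math.sin(theta_m)


def convert_unbiased(r_m, theta_m, sigma_theta):
    """Multiplicatively debiased conversion (assumes Gaussian bearing noise).

    For a zero-mean Gaussian bearing error, E[cos(error)] = exp(-sigma_theta**2 / 2),
    so scaling by the inverse of that factor makes the converted position unbiased.
    """
    scale = math.exp(sigma_theta ** 2 / 2.0)
    return scale * r_m * math.cos(theta_m), scale * r_m * math.sin(theta_m)


def mean_converted_x(convert, n, r, theta, sigma_r, sigma_theta, seed=0):
    """Monte Carlo average of the converted x coordinate over n noisy measurements."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        r_m = r + rng.gauss(0.0, sigma_r)              # noisy range
        theta_m = theta + rng.gauss(0.0, sigma_theta)  # noisy bearing
        total += convert(r_m, theta_m)[0]
    return total / n


# Illustrative scenario (assumed values): large bearing noise of 0.2 rad.
R, THETA, SIGMA_R, SIGMA_THETA = 1000.0, math.pi / 4, 10.0, 0.2
true_x = R * math.cos(THETA)
naive_mean = mean_converted_x(convert_naive, 200_000, R, THETA, SIGMA_R, SIGMA_THETA)
unbiased_mean = mean_converted_x(
    lambda r_m, t_m: convert_unbiased(r_m, t_m, SIGMA_THETA),
    200_000, R, THETA, SIGMA_R, SIGMA_THETA,
)
print(f"true x = {true_x:.1f}, naive mean = {naive_mean:.1f}, unbiased mean = {unbiased_mean:.1f}")
```

With these assumed values the naive average falls short of the true x coordinate by roughly the factor exp(-sigma_theta^2 / 2), while the scaled conversion averages close to it.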

Keywords: Measurement conversion, Doppler, Kalman filter, estimation, tracking.

Downloads 321

7344 Direct Measurements of Wind Data over 100 Meters above the Ground in the Site of Lendinara, Italy

Authors: A. Dal Monte, M. Raciti Castelli, G. B. Bellato, L. Stevanato, E. Benini

Abstract:

The wind resource in the Italian site of Lendinara (RO) is analyzed through a systematic anemometric campaign performed on the top of the bell tower, at an altitude of over 100 m above the ground. Both the average wind speed and the Weibull distribution are computed. The resulting average wind velocity is in accordance with the numerical predictions of the Italian Wind Atlas, confirming the accuracy of the extrapolation of wind data adopted for the evaluation of wind potential at higher altitudes with respect to the commonly placed measurement stations.
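
The sketch below shows one common way to obtain a Weibull distribution and a mean wind speed from anemometer data. It assumes a moment-based fit using the empirical approximation k ≈ (σ/μ)^(-1.086) for the shape parameter, which is not necessarily the fitting method used in the paper, and it runs on synthetic samples rather than the Lendinara measurements. All names and values are illustrative.

```python
import math
import random
import statistics


def fit_weibull_moments(speeds):
    """Estimate Weibull shape k and scale c from the sample mean and standard deviation.

    Uses the empirical approximation k = (std / mean) ** -1.086 and then
    c = mean / Gamma(1 + 1/k), which follows from the Weibull mean formula.
    """
    mean = statistics.fmean(speeds)
    std = statistics.pstdev(speeds)
    k = (std / mean) ** -1.086
    c = mean / math.gamma(1.0 + 1.0 / k)
    return k, c


def weibull_mean(k, c):
    """Mean of a Weibull distribution with shape k and scale c."""
    return c * math.gamma(1.0 + 1.0 / k)


# Synthetic wind speeds in m/s (assumed scale 6.0 and shape 2.0).
rng = random.Random(1)
speeds = [rng.weibullvariate(6.0, 2.0) for _ in range(50_000)]
k_hat, c_hat = fit_weibull_moments(speeds)
print(f"shape k = {k_hat:.2f}, scale c = {c_hat:.2f} m/s, mean = {weibull_mean(k_hat, c_hat):.2f} m/s")
```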

Keywords: Anemometric campaign, wind resource, Weibull distribution, wind atlas

Downloads 1928

7343 Reliability Based Performance Evaluation of Stone Column Improved Soft Ground

Authors: A. GuhaRay, C. V. S. P. Kiranmayi, S. Rudraraju

Abstract:

The present study considers the effect of variation of different geotechnical random variables in the design of stone column-foundation systems for assessing the bearing capacity and consolidation settlement of highly compressible soil. The soil and stone column properties, spacing, diameter and arrangement of stone columns are considered as the random variables. Probability of failure (P_f) is computed for a target degree of consolidation and a target safe load by Monte Carlo Simulation (MCS). The study shows that the variations in the coefficient of radial consolidation (c_r) and the cohesion of soil (c_s) are the two most important factors influencing P_f. If the coefficient of variation (COV) of c_r exceeds 20%, P_f exceeds 0.001, which is unsafe following the guidelines of the US Army Corps of Engineers. The bearing capacity also exceeds its safe value for COV of c_s > 30%. It is also observed that as the spacing between the stone columns increases, the probability of reaching a target degree of consolidation decreases. Accordingly, design guidelines, considering both consolidation and bearing capacity of improved ground, are proposed for different spacing and diameter of stone columns and geotechnical random variables.
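
A minimal sketch of how a probability of failure can be estimated by Monte Carlo simulation and how it grows with the coefficient of variation. It assumes a single lognormally distributed capacity compared against a fixed demand, which is a generic stand-in and not the stone-column consolidation or bearing-capacity model of the study. The function names and numbers are hypothetical.

```python
import math
import random


def lognormal_params(mean, cov):
    """Convert a mean and coefficient of variation into lognormal parameters."""
    sigma_ln = math.sqrt(math.log(1.0 + cov ** 2))
    mu_ln = math.log(mean) - 0.5 * sigma_ln ** 2
    return mu_ln, sigma_ln


def probability_of_failure(mean_capacity, cov, demand, n=200_000, seed=0):
    """Fraction of sampled capacities that fall below the demand."""
    rng = random.Random(seed)
    mu_ln, sigma_ln = lognormal_params(mean_capacity, cov)
    failures = sum(
        1 for _ in range(n) if rng.lognormvariate(mu_ln, sigma_ln) < demand
    )
    return failures / n


# Assumed values: mean capacity 100, demand 70, two levels of variability.
pf_low_cov = probability_of_failure(100.0, 0.10, 70.0)
pf_high_cov = probability_of_failure(100.0, 0.30, 70.0)
print(f"P_f at COV 10%: {pf_low_cov:.5f}")
print(f"P_f at COV 30%: {pf_high_cov:.5f}")
```

Even with the mean held fixed, the larger COV produces a much larger estimated P_f, which is the kind of sensitivity the abstract reports.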

Keywords: Bearing capacity, consolidation, geotechnical random variables, probability of failure, stone columns.

Downloads 1128

7342 Effect of Size of the Step in the Response Surface Methodology using Nonlinear Test Functions

Authors: Jesús Everardo Olguín Tiznado, Rafael García Martínez, Claudia Camargo Wilson, Juan Andrés López Barreras, Everardo Inzunza González, Javier Ordorica Villalvazo

Abstract:

The response surface methodology (RSM) is a collection of mathematical and statistical techniques useful for modeling and analyzing problems in which the dependent variable is influenced by several independent variables, in order to determine the conditions under which these variables should operate to optimize a production process. The RSM estimates a first-order regression model and sets the search direction using the method of maximum/minimum slope up/down (MMS U/D). However, this method selects the step size intuitively, which can affect the efficiency of the RSM. This paper assesses how the step size affects the efficiency of this methodology. The numerical examples are carried out through Monte Carlo experiments, evaluating three response variables: efficiency gain function, the optimum distance and the number of iterations. The simulation results showed that the response variables efficiency gain function and optimum distance were not affected by the step size, while the number of iterations was affected by both the step size and the type of test function used.

Keywords: RSM, dependent variable, independent variables, efficiency, simulation

Downloads 1955

7341 Diagnosing the Cause and its Timing of Changes in Multivariate Process Mean Vector from Quality Control Charts using Artificial Neural Network

Authors: Farzaneh Ahmadzadeh

Abstract:

Quality control charts are very effective in detecting out-of-control signals. However, when a control chart signals an out-of-control condition of the process mean, searching for a special cause in the vicinity of the signal time does not always lead to prompt identification of the source(s) of the condition, because the change point in the process parameter(s) usually differs from the signal time. It is very important for the manufacturer to determine at what point, and in which parameters, the change that caused the signal occurred. Early warning of process change would expedite the search for the special causes and enhance quality at lower cost. In this paper, the quality variables under investigation are assumed to follow a multivariate normal distribution with known means and variance-covariance matrix, and the process means after a one-step change are assumed to remain at the new level until the special cause is identified and removed. It is also assumed that only one variable can change at a time. This research applies an artificial neural network (ANN) to identify the time at which the change occurred and the parameter that caused the change or shift. The performance of the approach was assessed through a computer simulation experiment. The results show that the neural network performs effectively and equally well across the whole range of shift magnitudes considered.

Keywords: Artificial neural network, change point estimation, Monte Carlo simulation, multivariate exponentially weighted moving average

Downloads 1340

7340 A Case Study on the Numerical-Probability Approach for Deep Excavation Analysis

Authors: Komeil Valipourian

Abstract:

Urban advances and the growing need for developing infrastructure have increased the importance of deep excavations. In this study, after introducing probability analysis as an important issue, an attempt has been made to apply it to the deep excavation project of Bangkok’s Metro as a case study. For this, a numerical probability model has been developed based on the Finite Difference Method and a Monte Carlo sampling approach. The results indicate that disregarding the issue of probability in this project will result in an inappropriate design of the retaining structure. Therefore, probabilistic redesign of the support is proposed and carried out as one of the applications of probability analysis. A 50% reduction in the flexural strength of the structure increases the failure probability by just 8%, within the allowable range, and helps improve economic conditions while maintaining mechanical efficiency. With regard to the lack of efficient design in most deep excavations, by considering geometrical and geotechnical variability, an attempt was made to develop an optimum practical design standard for deep excavations based on failure probability. On this basis, a practical relationship is presented for estimating the maximum allowable horizontal displacement, which can help improve design conditions without carrying out the probability analysis.

Keywords: Numerical probability modeling, deep excavation, allowable maximum displacement, finite difference method, FDM.

Downloads 626

7339 Probabilistic Method of Wind Generation Placement for Congestion Management

Authors: S. Z. Moussavi, A. Badri, F. Rastegar Kashkooli

Abstract:

Wind farms (WFs) with a high level of penetration are being established in power systems worldwide more rapidly than other renewable resources. The Independent System Operator (ISO), as a policy maker, should propose appropriate places for WF installation in order to maximize the benefits for the investors. There is also a possibility of congestion relief through the new installation of WFs, which should be taken into account by the ISO when proposing the locations for WF installation. In this context, an efficient WF placement method is proposed in order to reduce burdens on congested lines. Since the wind speed is a random variable and load forecasts also contain uncertainties, probabilistic approaches are used for this type of study. AC probabilistic optimal power flow (P-OPF) is formulated and solved using Monte Carlo Simulations (MCS). In order to reduce computation time, point estimate methods (PEM) are introduced as an efficient alternative to time-demanding MCS. Subsequently, WF optimal placement is determined using generation shift distribution factors (GSDF), considering a new parameter called the wind availability factor (WAF). In order to obtain more realistic results, N-1 contingency analysis is employed to find the optimal size of the WF by means of line outage distribution factors (LODF). The IEEE 30-bus test system is used to show and compare the accuracy of the proposed methodology.

Keywords: Probabilistic optimal power flow, Wind power, Point estimate methods, Congestion management

Downloads 1841

7338 The Application of Real Options to Capital Budgeting

Authors: George Yungchih Wang

Abstract:

Real options theory suggests that managerial flexibility embedded within irreversible investments can account for a significant value in project valuation. Although the argument has become the dominant focus of capital investment theory over decades, recent survey literature in capital budgeting indicates that corporate practitioners still do not explicitly apply real options in investment decisions. In this paper, we explore how real options decision criteria can be transformed into equivalent capital budgeting criteria under the consideration of uncertainty, assuming that the underlying stochastic process follows a geometric Brownian motion (GBM), a mixed diffusion-jump (MX), or a mean-reverting process (MR). These equivalent valuation techniques can be readily decomposed into conventional investment rules and “option impacts”, the latter of which describe the impacts on optimal investment rules with the option value considered. Based on numerical analysis and Monte Carlo simulation, three major findings are derived. First, it is shown that real options could be successfully integrated into the mindset of conventional capital budgeting. Second, the inclusion of option impacts tends to delay investment. The delay effect is the most significant under a GBM process and the least significant under an MR process. Third, it is optimal to adopt the new capital budgeting criteria in investment decision-making, and adopting a suboptimal investment rule without considering real options could lead to a substantial loss in value.
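
For readers unfamiliar with the GBM assumption, the sketch below simulates terminal values of a geometric Brownian motion with its exact solution, S_T = S_0 exp((mu - sigma^2/2) T + sigma sqrt(T) Z), and checks the sample mean against the known expectation S_0 exp(mu T). It does not implement the mixed diffusion-jump or mean-reverting processes, nor the capital budgeting criteria of the paper. Parameter values and names are illustrative assumptions.

```python
import math
import random


def simulate_gbm_terminal(s0, mu, sigma, horizon, n, seed=0):
    """Draw n terminal values of a GBM using its exact lognormal solution."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * horizon
    vol = sigma * math.sqrt(horizon)
    return [s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0)) for _ in range(n)]


# Assumed project value process: S_0 = 100, 5% drift, 20% volatility, 1 year.
terminal_values = simulate_gbm_terminal(100.0, 0.05, 0.20, 1.0, 100_000)
sample_mean = sum(terminal_values) / len(terminal_values)
expected_mean = 100.0 * math.exp(0.05 * 1.0)
print(f"Monte Carlo mean = {sample_mean:.2f}, analytic mean = {expected_mean:.2f}")
```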

Keywords: real options, capital budgeting, geometric Brownian motion, mixed diffusion-jump, mean-reverting process

Downloads 2729

7337 Ion Thruster Grid Lifetime Assessment Based on Its Structural Failure

Authors: Juan Li, Jiawen Qiu, Yuchuan Chu, Tianping Zhang, Wei Meng, Yanhui Jia, Xiaohui Liu

Abstract:

This article develops a 3D numerical model of sputter erosion depth in an ion thruster optics system using the IFE-PIC (Immersed Finite Element-Particle-in-Cell) and Monte Carlo methods, calculates the sputter erosion rate on the downstream surface of the accelerator grid, and compares it with LIPS-200 life test data. The results of the numerical model are in reasonable agreement with the measured data. Finally, the lifetime of the 20 cm diameter ion thruster is predicted from the erosion data obtained with the model. The final result shows that, under normal operating conditions, the erosion rate of the groove wear on the downstream surface of the accelerator grid is 34.6 μm/1000 h, which means the conservative lifetime until structural failure of the accelerator grid is 11500 hours.

Keywords: Ion thruster, accelerator grid, sputter erosion, lifetime assessment.

Downloads 1960

7336 Probability-Based Damage Detection of Structures Using Model Updating with Enhanced Ideal Gas Molecular Movement Algorithm

Authors: M. R. Ghasemi, R. Ghiasi, H. Varaee

Abstract:

The model updating method has received increasing attention in damage detection of structures based on measured modal parameters. Therefore, a probability-based damage detection (PBDD) procedure based on a model updating procedure is presented in this paper, in which a one-stage model-based damage identification technique based on the dynamic features of a structure is investigated. The presented framework uses a finite element updating method with a Monte Carlo simulation that considers the uncertainty caused by measurement noise. Enhanced ideal gas molecular movement (EIGMM) is used as the main algorithm for model updating. Ideal gas molecular movement (IGMM) is a multiagent algorithm based on the movement of ideal gas molecules. Ideal gas molecules disperse rapidly in different directions and cover all the space inside. This is embedded in the high speed of molecules and in collisions between them and with the surrounding barriers. In the IGMM algorithm, to reach optimal solutions, the initial population of gas molecules is randomly generated and the governing equations related to the velocity of gas molecules and the collisions between them are utilized. In this paper, an enhanced version of IGMM, which removes unchanged variables after specified iterations, is developed. The proposed method is implemented on two numerical examples in the field of structural damage detection. The results show that the proposed method performs well and is competitive in PBDD of structures.

Keywords: Enhanced ideal gas molecular movement, ideal gas molecular movement, model updating method, probability-based damage detection, uncertainty quantification.

Downloads 1045

7335 The MUST ADS Concept

Authors: J-B. Clavel, N. Thiollière, B. Mouginot

Abstract:

The presented work is motivated by a French law regarding nuclear waste management. A new conceptual Accelerator Driven System (ADS) designed for the Minor Actinides (MA) transmutation has been assessed by numerical simulation. The MUltiple Spallation Target (MUST) ADS combines high thermal power (up to 1.4 GWth) and high specific power. A 30 mA and 1 GeV proton beam is divided into three secondary beams transmitted on three liquid lead-bismuth spallation targets. Neutron and thermal-hydraulic simulations have been performed with the code MURE, based on the Monte-Carlo transport code MCNPX. A methodology has been developed to define the characteristics of the MUST ADS concept according to a specific transmutation scenario. The reference scenario is based on a MA flux (neptunium, americium and curium) coming from European Fast Reactor (EPR), and a plutonium multireprocessing strategy is accounted for. The MUST ADS reference concept is a sodium cooled fast reactor. The MA fuel at equilibrium is mixed with MgO inert matrix to limit the core reactivity and improve the fuel thermal conductivity. The fuel is irradiated over five years. Five years of cooling and two years for the fuel fabrication are taken into account. The MUST ADS reference concept burns about 50% of the initial MA inventory during a complete cycle. In terms of mass, up to 570 kg/year are transmuted in one concept. The methodology to design the MUST ADS and to calculate fuel composition at equilibrium is precisely described in the paper. A detailed fuel evolution analysis is performed and the reference scenario is compared to a scenario where only americium transmutation is performed.

Keywords: Accelerator Driven System, double strata scenario, minor actinides, MUST, transmutation.

Downloads 1655

7334 Idiopathic Constipation can be Subdivided in Clinical Subtypes: Data Mining by Cluster Analysis on a Population based Study

Authors: Mauro Giacomini, Stefania Bertone, Carlo Mansi, Pietro Dulbecco, Vincenzo Savarino

Abstract:

The prevalence of non-organic constipation differs from country to country, and the reliability of the estimated rates is uncertain. Moreover, the clinical relevance of subdividing the heterogeneous functional constipation disorders into pre-defined subgroups is largely unknown. Aim: to estimate the prevalence of constipation in a population-based sample and determine whether clinical subgroups can be identified. An age- and gender-stratified population sample from 5 Italian cities was evaluated using a previously validated questionnaire. Data mining by cluster analysis was used to determine constipation subgroups. Results: 1,500 complete interviews were obtained from 2,083 contacted households (72%). Self-reported constipation correlated poorly with symptom-based constipation, which was found in 496 subjects (33.1%). Cluster analysis identified four constipation subgroups which correlated to subgroups identified according to pre-defined symptom criteria. Significant differences in socio-demographics and lifestyle were observed among subgroups.

Keywords: Cluster analysis, constipation, data mining, statistical analysis.

Downloads 1265

7333 Nonlinear Finite Element Modeling of Deep Beam Resting on Linear and Nonlinear Random Soil

Authors: M. Seguini, D. Nedjar

Abstract:

An accurate nonlinear analysis of a deep beam resting on elastic perfectly plastic soil is carried out in this study. A nonlinear finite element model for large deflection and moderate rotation of an Euler-Bernoulli beam resting on linear and nonlinear random soil is investigated. The geometric nonlinear analysis of the beam is based on the theory of von Kármán, and the Newton-Raphson incremental iteration method is implemented in a Matlab code to solve the nonlinear equation of the soil-beam interaction system. Two analyses (deterministic and probabilistic) are proposed to verify the accuracy and the efficiency of the proposed model, where the theory of the local average based on the Monte Carlo approach is used to analyze the effect of the spatial variability of the soil properties on the nonlinear beam response. The effects of six main parameters are investigated: the external load, the length of the beam, the coefficient of subgrade reaction of the soil, the Young’s modulus of the beam, the coefficient of variation and the correlation length of the soil’s coefficient of subgrade reaction. A comparison between the beam resting on linear and nonlinear soil models is presented for different beam lengths and external loads. Numerical results have been obtained for the combination of the geometric nonlinearity of the beam and the material nonlinearity of the random soil. This comparison highlights the need to include the material nonlinearity and spatial variability of the soil in the geometric nonlinear analysis when the beam undergoes large deflections.

Keywords: Finite element method, geometric nonlinearity, material nonlinearity, soil-structure interaction, spatial variability.

Downloads 1885

7332 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and, more importantly, the challenges it poses to privacy and data protection. First, the nature of big data is briefly discussed before presenting the potential of big data at present. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing it in big data. In conclusion, the paper puts forward the debate on the adequacy of the existing legal framework in protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Downloads 3876

7331 Arsenic Mobility from Mining Tailings of Monte San Nicolas to Presa de Mata in Guanajuato, Mexico

Authors: I. Cano-Aguilera, B. E. Rubio-Campos, G. De la Rosa, A. F. Aguilera-Alvarado

Abstract:

Mining tailings are a source of heavy-metal-rich material that poses a potential danger to public health and the environment, since these metals, under certain conditions, can leach and contaminate aqueous systems that serve as sources of potable water. The strategy for this work is based on observation, experimentation and simulation, combining the actual responses of the hydrodynamic behavior of metals leached from mining tailings with the applied mathematics that provides the logical structure to decipher the individual effects of the general physicochemical phenomenon. The case study presented herein focuses on mining tailings deposits located in Monte San Nicolas, Guanajuato, Mexico, an abandoned mine. This was considered the contamination source that, under certain physicochemical conditions, can favor metal leaching and its transport towards aqueous systems. In addition, the cartography, meteorology, geology, and the hydrodynamic and hydrological characteristics of the place will be helpful in determining the way and the time in which these systems can interact. Preliminary results demonstrated that arsenic is highly mobile, since it was identified in several surface aqueous systems of the micro watershed, as well as in sediments, in concentrations that exceed the maximum limits established in the official norms. Variations in pH and oxidation-reduction potential were also recorded, conditions that favor the presence of different species of this element, its solubility and therefore its mobility.

Keywords: Arsenic, mining tailings, transport.

Downloads 1650

7330 Kurtosis, Renyi's Entropy and Independent Component Scalp Maps for the Automatic Artifact Rejection from EEG Data

Authors: Antonino Greco, Nadia Mammone, Francesco Carlo Morabito, Mario Versaci

Abstract:

The goal of this work is to improve the efficiency and the reliability of automatic artifact rejection, in particular from Electroencephalographic (EEG) recordings. Artifact rejection is a key topic in signal processing. Artifacts are unwelcome signals that may occur during signal acquisition and that may alter the analysis of the signals themselves. A technique for automatic artifact rejection, based on Independent Component Analysis (ICA) for the artifact extraction and on some high order statistics such as kurtosis and Shannon's entropy, was proposed some years ago in the literature. In this paper we enhance this technique by introducing Renyi's entropy. The performance of our method was tested using the Independent Component scalp maps and compared to the performance of the method in the literature, which it was shown to outperform.
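
The two statistics named above can be sketched as follows. This assumes the standard definitions: excess kurtosis m4 / m2^2 - 3 from central moments, and Renyi's entropy of order alpha, H_alpha = log(sum_i p_i^alpha) / (1 - alpha), computed on a discrete probability vector such as a normalized histogram. How the paper estimates these on independent components, and which thresholds it uses, is not stated here, so the helper names and inputs are purely illustrative.

```python
import math


def excess_kurtosis(samples):
    """Excess kurtosis from central moments (0 for a Gaussian signal)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 ** 2) - 3.0


def renyi_entropy(probabilities, alpha=2.0):
    """Renyi entropy of order alpha (alpha != 1) for a discrete distribution."""
    if alpha == 1.0:
        raise ValueError("alpha = 1 is the Shannon limit; use a different order")
    power_sum = sum(p ** alpha for p in probabilities if p > 0.0)
    return math.log(power_sum) / (1.0 - alpha)


# A uniform distribution over 4 bins has entropy log(4) for every order alpha.
print(renyi_entropy([0.25, 0.25, 0.25, 0.25], alpha=2.0))
# A short flat sequence is platykurtic, so its excess kurtosis is negative.
print(excess_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0]))
```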

Keywords: Artifact, EEG, Renyi's entropy, independent component analysis, kurtosis.

Downloads 2387

7329 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present, or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set, but this is not the case. Thus, we present the most well-known algorithms for each step of data pre-processing so that one can achieve the best performance for one's data set.

Keywords: Data mining, feature selection, data cleaning.

Downloads 5960

7328 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Downloads 4832

7327 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation data discovery algorithms expressed as formal formulas, which can find data inconsistent with conditional attribute dependencies across all target columns, whether the data is structured (SQL) or unstructured (NoSQL). Finally, it gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Downloads 2572

7326 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations for retrieving data from multiple data marts. To this end, we have defined six operations which coalesce data marts. While doing so, we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Downloads 1522

7325 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Downloads 2432

7324 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

In recent years, many efforts and studies have been devoted to developing proficient tools for performing various tasks on big data. Recently, big data has received a lot of publicity, for good reason. Because the collections of datasets are large and complex, they are difficult to process with traditional data processing applications. This concern makes it all the more necessary to produce various big data tools. Moreover, the main aim of big data analytics is to apply advanced analytic techniques to very large, diverse datasets that range in size from terabytes to zettabytes and in type from structured to unstructured and from batch to streaming. Big data is useful for data sets whose size or type is beyond the capability of traditional relational databases to capture, manage and process the data with low latency. The resulting challenges have led to the emergence of powerful big data tools. In this survey, a varied collection of big data tools is illustrated and compared in terms of their salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Downloads 3743

7323 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, which are characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are also discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads 1269

7322 Environmental Impact of Sustainability Dispersion of Chlorine Releases in Coastal Zone of Alexandra: Spatial-Ecological Modeling

Authors: Mohammed El Raey, Moustafa Osman Mohammed

Abstract:

Spatial-ecological modeling relates sustainable dispersion to social development. Sustainability under a spatial-ecological model directs attention to urban environments in design review management so as to comply with Earth’s system. Naturally exchanged patterns of ecosystems have consistent and periodic cycles that preserve energy and material flows in Earth’s system. The Probabilistic Risk Assessment (PRA) technique is used to assess the safety of an industrial complex. A second analytical approach, Failure-Safe Mode and Effect Analysis (FMEA), is applied to critical components. The plant safety parameters are identified for the engineering topology as employed in assessing the safety of industrial ecology. In particular, the most severe accidental release of hazardous gas is postulated, analyzed, and assessed for an industrial region. The IAEA safety assessment procedure is used to account for the duration and rate of discharge of liquid chlorine. The plume dispersion width and the concentration of chlorine gas in the downwind direction are determined using the Gaussian Plume Model for urban and rural areas and presented with SURFER®. The predicted accident consequences are traced as risk contour concentration lines. The local greenhouse effect is predicted with relevant conclusions. The spatial-ecological model is also used to predict distribution schemes for multiple factors in a multi-criteria analysis. The input–output analysis is explored from the spillover effect, and Monte Carlo simulations are conducted for sensitivity analysis. The unique structure of these patterns is balanced within “equilibrium patterns”, such as the composite index for the biosphere with a collective structure of many distributed feedback flows. These dynamic structures are related through their physical and chemical properties and enable a gradual and prolonged incremental pattern.
While this spatial model structure argues from ecology, resource savings, static load design, financial and other pragmatic reasons, the outcomes are not decisive from an artistic/architectural perspective. The hypothesis is deployed to unify analytic and analogical spatial structure in the development of urban environments, using optimization loads as an example of an integrated industrial structure where the process is based on the engineering topology of systems ecology.
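
For readers unfamiliar with the Gaussian Plume Model named in this abstract, the following is a minimal sketch of its standard textbook form for a continuous point source with ground reflection. It is not the authors' implementation: the function name, the parameter names, and the choice to pass the dispersion coefficients `sigma_y` and `sigma_z` in directly (rather than deriving them from downwind distance and atmospheric stability) are all assumptions made for illustration.

```python
import math


def plume_concentration(q, u, sigma_y, sigma_z, y, z, h):
    """Steady-state concentration from a continuous point source.

    Textbook Gaussian plume with ground reflection (illustrative only):
      q        emission rate (e.g. g/s)
      u        mean wind speed along the plume axis (m/s)
      sigma_y  lateral dispersion coefficient at the downwind distance (m)
      sigma_z  vertical dispersion coefficient at the downwind distance (m)
      y        crosswind offset from the plume centerline (m)
      z        receptor height above ground (m)
      h        effective release height (m)
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    # Crosswind Gaussian spread about the centerline.
    lateral = math.exp(-(y ** 2) / (2 * sigma_y ** 2))
    # Vertical spread plus an image source below ground to model reflection.
    vertical = (
        math.exp(-((z - h) ** 2) / (2 * sigma_z ** 2))
        + math.exp(-((z + h) ** 2) / (2 * sigma_z ** 2))
    )
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical
```

Evaluating this function over a grid of crosswind offsets, with coefficients chosen for each downwind distance, would give the kind of concentration field from which contour lines can be drawn. The coefficients themselves depend on terrain (urban or rural) and stability class, which this sketch leaves to the caller.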

Keywords: Spatial-ecological modeling, spatial structure orientation impact, composite structure, industrial ecology.

Downloads 120

7321 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The need for ever-increasing use of distributed data in computer networks is evident. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication strategies. We examine their characteristics in several scenarios and then propose some suggestions for their improvement.

Keywords: data replication, data hiding, consistency, dynamic data replication strategy

Downloads 1601

7320 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Owing to the development of information technology and wireless Internet technology, a wide variety of data is being generated in many fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and spread of boards such as Arduino and Raspberry Pi have made it easy to test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various kinds of research and can serve as data for data mining. However, using a board to collect data poses many difficulties, particularly for users who are not computer programmers or who are using it for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads 1969

7319 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver (big) data related integrated services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Downloads 2017

7318 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is compounded when the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads 2745

7317 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing patients' private data with people other than clinicians or hospital staff is a security problem, so person identification information has to be removed from the medical data. After a de-identification process, the medical data without private data are available for any research purpose. In this paper, we introduce a universal, automatic, rule-based de-identification application that performs this process on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The identical identification number is used for all of a patient's medical data, so relationships within the data are preserved. The hospital can take advantage of research feedback based on the results.
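
The core idea in this abstract, replacing a patient's identity with one consistent number so that records stay linked, can be sketched as follows. This is an illustrative sketch only and not the authors' application: the class name `Deidentifier`, the field names (`patient_id`, `name`, `research_id`), and the use of plain dictionaries as records are all hypothetical, and it does not cover DICOM headers or burned-in pixel annotations.

```python
from itertools import count


class Deidentifier:
    """Rule-based de-identification of dictionary records (illustrative only)."""

    def __init__(self, private_fields, id_field="patient_id"):
        # Rule set: names of fields that must never leave the hospital.
        self._private = set(private_fields)
        self._id_field = id_field
        # Real identifier -> research number; kept inside the hospital only.
        self._pseudonyms = {}
        self._next_number = count(1)

    def pseudonym(self, real_id):
        """Return the same research number every time for the same patient."""
        if real_id not in self._pseudonyms:
            self._pseudonyms[real_id] = next(self._next_number)
        return self._pseudonyms[real_id]

    def deidentify(self, record):
        """Drop private fields and swap the real identifier for a pseudonym."""
        clean = {
            key: value
            for key, value in record.items()
            if key not in self._private and key != self._id_field
        }
        clean["research_id"] = self.pseudonym(record[self._id_field])
        return clean
```

Because the mapping is held in one place, two studies of the same patient receive the same `research_id`, which is what keeps the relationships in the data intact. Sequential numbers are used here for simplicity; a real deployment would also need to decide how the mapping table is stored and protected.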

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads 1605

7316 Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Authors: Masahiro Kuzunishi, Tetsuya Furukawa, Ke Lu

Abstract:

Classifying data hierarchically is an efficient approach to analyzing data. Data is usually classified into multiple categories or annotated with a set of labels. To analyze multi-labeled data, such data must be specified by giving a set of labels as a semantic range. Data is analyzed for certain specific purposes. This paper shows which multi-labeled data should be the target of analysis for those purposes, and discusses the role of a label against a set of labels by investigating the change that occurs when a label is added to the set. These discussions yield methods for the advanced analysis of multi-labeled data, which are based on the role of a label against a semantic range.

Keywords: Classification Hierarchies, Data Analysis, Multilabeled Data, Orders of Sets of Labels

Downloads 1182