Search results for: Spatial data analysis

13370 Growing Self Organising Map Based Exploratory Analysis of Text Data

Authors: Sumith Matharage, Damminda Alahakoon

Abstract:

Textual data plays an important role in the modern world. The potential for applying data mining techniques to uncover hidden information in large text collections is immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across a wide range of disciplines to discover hidden patterns in data. A comprehensive analysis of the GSOM’s capabilities as a text clustering and visualisation tool has so far not been published. These capabilities, namely map visualisation, automatic cluster identification and hierarchical clustering, are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.

Keywords: Text Clustering, Growing Self Organizing Map, Automatic Cluster Identification, Hierarchical Clustering.

Downloads: 1995

13369 Long Term Variability of Temperature in Armenia in the Context of Climate Change

Authors: Hrachuhi Galstyan, Lucian Sfîcă, Pavel Ichim

Abstract:

The purpose of this study is to analyze the temporal and spatial variability of thermal conditions in the Republic of Armenia. The paper describes annual fluctuations in air temperature. The research focuses on Armenia and its surrounding areas as a case study region, where long-term measurements and observations of weather conditions have been carried out by the National Meteorological Service of Armenia. The study uses yearly air temperature data recorded between 1961 and 2012. The Mann-Kendall test and the autocorrelation function were applied to detect the trend in annual mean temperature, along with other parametric and non-parametric tests for detecting breaks in the long-term evolution of temperature. The analysis of all records reveals a tendency mostly towards warmer years, with increased temperatures especially in valleys and inner basins. The maximum temperature increase is up to 1.5°C. Negative results have not been observed in Armenia. The patterns of temperature change have been observed since the 1990s over much of the Armenian territory. According to the methods employed in the study, the climate in Armenia has been influenced by global change over the last two decades.
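As a rough illustration of the trend test named in the abstract, the sketch below computes the Mann-Kendall S statistic, its normal-approximation Z score and a two-sided p-value. It is not the authors' code: the function name `mann_kendall` and the sample temperatures are hypothetical, and the variance formula assumes there are no tied values in the series.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test.

    Assumes no tied values, so the variance needs no tie correction.
    """
    n = len(series)
    s = 0
    # S counts increasing pairs minus decreasing pairs over all i < j.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return s, z, p


# Hypothetical annual mean temperatures (°C), for illustration only.
temps = [9.1, 9.4, 9.0, 9.6, 9.8, 9.7, 10.1, 10.3, 10.0, 10.6]
s, z, p = mann_kendall(temps)
print(f"S={s}, Z={z:.2f}, p={p:.4f}")
```

A positive S with a small p-value would point to an increasing trend; real series with repeated values would need the tie-corrected variance.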

Keywords: Air temperature, long-term variability, trend, climate change.

Downloads: 2214

13368 Quality of Groundwater in the Shallow Aquifers of a Paddy Dominated Agricultural River Basin, Kerala, India

Authors: N. Kannan, Sabu Joseph

Abstract:

Groundwater is an essential and vital component of our life support system. Groundwater resources are used for drinking, irrigation and industrial purposes. There is growing concern about the deterioration of groundwater quality due to geogenic and anthropogenic activities. Groundwater, being a fragile resource, must be carefully managed to maintain its purity within standard limits. Quality assessment and management must therefore be carried out hand in hand to achieve a pollution-free environment and sustainable use. To assess its quality for human consumption and for agricultural use, groundwater from the shallow aquifers (dug wells) in the Palakkad and Chittur taluks of the Bharathapuzha river basin, a paddy-dominated agricultural basin (order = 8th; L = 209 km; area = 6186 km²) in Kerala, India, was selected. Water samples (n = 120) collected in three seasons, viz., monsoon (MON; August 2005), postmonsoon (POM; December 2005) and premonsoon (PRM; April 2006), were analyzed for important physico-chemical attributes. The attributes vary spatially and temporally across the study area, and different hydrochemical facies have been identified based on major cations and anions. Using Gibbs' diagram, rock dominance has been identified as the mechanism controlling groundwater chemistry. Further, the suitability of the water for irrigation was determined by analyzing the salinity hazard indicated by the sodium adsorption ratio (SAR), residual sodium carbonate (RSC) and sodium percent (%Na). Finally, stress zones in the study area were delineated using ArcGIS spatial analysis, and various management options were recommended to restore the ecosystem.
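The three irrigation indices mentioned in the abstract have widely used textbook definitions, sketched here with all ion concentrations in meq/L. The function names and the sample concentrations are illustrative assumptions. The abstract does not say which variant of %Na the authors used; the version below includes potassium.

```python
import math


def sar(na, ca, mg):
    """Sodium adsorption ratio: Na / sqrt((Ca + Mg) / 2), inputs in meq/L."""
    return na / math.sqrt((ca + mg) / 2.0)


def rsc(co3, hco3, ca, mg):
    """Residual sodium carbonate: (CO3 + HCO3) - (Ca + Mg), in meq/L."""
    return (co3 + hco3) - (ca + mg)


def percent_na(na, k, ca, mg):
    """Sodium percent: (Na + K) / (Ca + Mg + Na + K) * 100."""
    return (na + k) / (ca + mg + na + k) * 100.0


# Hypothetical sample, not taken from the study.
sample = {"na": 4.0, "k": 1.0, "ca": 2.0, "mg": 3.0, "co3": 0.5, "hco3": 3.0}
print(round(sar(sample["na"], sample["ca"], sample["mg"]), 2))
print(rsc(sample["co3"], sample["hco3"], sample["ca"], sample["mg"]))
print(percent_na(sample["na"], sample["k"], sample["ca"], sample["mg"]))
```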

Keywords: Groundwater quality, agricultural basin, Kerala, India.

Downloads: 2597

13367 Design and Analysis of Gauge R&R Studies: Making Decisions Based on ANOVA Method

Authors: Afrooz Moatari Kazerouni

Abstract:

In a competitive production environment, critical decisions are based on data obtained by random sampling of product units. The effectiveness of these decisions depends on the quality and reliability of the data, which makes a reliable measurement system necessary. The process of estimating and analysing the errors that contribute to a measurement system is known as Measurement System Analysis (MSA). The aim of this research is to establish the need for, and benefit of, further development in analysing measurement systems, particularly through Gauge Repeatability and Reproducibility (GR&R) studies to improve physical measurements. In manufacturing industries today, GR&R studies are well known, but they are not applied as widely as other measurement system analysis methods. To introduce this method and to gain feedback for improving measurement systems, this study focuses on the “ANOVA” method, the most widespread way of calculating Repeatability and Reproducibility (R&R).

Keywords: Analysis of Variance (ANOVA), Measurement System Analysis (MSA), Part-Operator interaction effect, Repeatability and Reproducibility.

Downloads: 4667

13366 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA), which may allow academic institutions to better understand learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitations before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Downloads: 4872

13365 EDULOGIC+ - Knowledge Management through Data Analysis in Education

Authors: Alok Sharma, Dr. Harvinder S. Saini, Raviteja Tiruvury

Abstract:

This paper outlines the application of Knowledge Management (KM) principles in the context of educational institutions. The paper caters to the needs of engineering institutions for imparting quality education by delineating the instruction delivery process in a highly structured, controlled and quantified manner. This is done using a software tool, EDULOGIC+. The central idea is based on the engineering education pattern in Indian universities and institutions. The data, contents and results produced over contiguous years build the necessary ground for managing the related accumulated knowledge. The application of KM is explained using examples of data analysis and knowledge extraction.

Keywords: Education software system, information system, knowledge management.

Downloads: 1751

13364 A Novel Steganographic Method for Gray-Level Images

Authors: Ahmad T. Al-Taani, Abdullah M. AL-Issa

Abstract:

In this work we propose a novel steganographic method for hiding information within the spatial domain of a gray-scale image. The proposed approach divides the cover image into blocks of equal size and then embeds the message in the edge of each block, depending on the number of ones in the left four bits of the pixel. The approach is tested on a database consisting of 100 different images. Experimental results, compared with other methods, show that the proposed approach hides more information and produces a stego-image of good visual quality to the human eye.

Keywords: Data Embedding, Cryptography, Watermarking, Steganography, Least Significant Bit, Information Hiding.

Downloads: 2265

13363 An Improved Preprocessing for Biosonar Target Classification

Authors: Turgay Temel, John Hallam

Abstract:

An improved processing description for biosonar signal processing in a cochlea model is proposed and examined. It is compared with conventional models using a modified discriminant analysis, and both are tested. Their performance is evaluated with echo data captured from natural targets (trees). Results indicate that the phase characteristics of the low-pass filters employed in the echo processing have a significant effect on class separability for this data.

Keywords: Cochlea model, discriminant analysis, neurospike coding, classification.

Downloads: 1491

13362 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

The problems arising from unbalanced data sets generally appear in real-world applications. Due to unequal class distribution, many researchers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is a method that was proposed for classifying unbalanced classes with good performance. In this study, the methods of discriminant analysis are of interest in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. The purpose of this study was to compare the classification performance of parametric and nonparametric discriminant analysis in a three-class classification of class-imbalanced data of diabetes risk groups. Data from a project maintaining healthy conditions for 599 employees of a government hospital in Bangkok were obtained for the classification problem. The employees were divided into three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, including the variables of diabetes risk group, age, gender, blood glucose, and BMI, were analyzed and bootstrapped for 50 and 100 samples, 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for departure from multivariate normality and for equality of the covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed nonnormality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were applied to the 50 and 100 bootstrap samples and to the original data. In searching for the optimal classification rule, the prior probabilities were set both to equal proportions (0.33:0.33:0.33) and to unequal proportions of (0.90:0.05:0.05), (0.80:0.10:0.10) and (0.70:0.15:0.15).
The results from the 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k=3 or k=4 and prior probabilities for non-risk:risk:diabetic of 0.90:0.05:0.05 or 0.80:0.10:0.10 gave the smallest misclassification error rate. The k-nearest neighbors approach is therefore suggested for classifying three-class imbalanced data of diabetes risk groups.

Keywords: Bootstrap, diabetes risk groups, error rate, k-nearest neighbors.

Downloads: 2007

13361 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation data discovery algorithms expressed as formal formulas, which can find data inconsistent with conditional attribute dependencies across all target columns, whether the data is structured (SQL) or unstructured (NoSQL). It then gives six data cleaning methods based on these algorithms.
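To make the idea of dependency-rule violations concrete, the sketch below flags rows that break a simple rule of the form "the condition columns determine the target column". This is a generic functional-dependency check, not the algorithms from the paper; the function name `find_violations`, the column names and the records are all hypothetical.

```python
from collections import defaultdict


def find_violations(rows, condition_cols, target_col):
    """Return condition-value groups that map to more than one target value.

    rows: list of dicts (works equally for SQL result rows or JSON-like documents).
    """
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in condition_cols)
        groups[key].add(row[target_col])
    # A group violates the rule when its key maps to several target values.
    return {key: values for key, values in groups.items() if len(values) > 1}


# Hypothetical records: the rule "zip determines city" is broken for 10001.
records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "San Francisco"},
]
print(find_violations(records, ["zip"], "city"))
```

A repair step could then pick one value per violating group, for example the most frequent one, but that choice is outside what the abstract specifies.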

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Downloads: 2610

13360 Denoising by Spatial Domain Averaging for Wireless Local Area Network Terminal Localization

Authors: Diego Felix, Eugene Hyun, Michael McGuire, Mihai Sima

Abstract:

Terminal localization for indoor Wireless Local Area Networks (WLANs) is critical for the deployment of location-aware computing inside buildings. A major challenge is obtaining high localization accuracy in the presence of fluctuations in the received signal strength (RSS) measurements caused by multipath fading. This paper focuses on reducing the effect of the distance-varying noise by spatial filtering of the measured RSS. Two different survey point geometries are tested with the noise reduction technique: survey points arranged in sets of clusters and survey points uniformly distributed over the network area. The results show that the location accuracy improves by 16% when the filter is used and by 18% when the filter is applied to a clustered survey set as opposed to a straight-line survey set. The estimated locations are within 2 m of the true location, which indicates that clustering the survey points provides better localization accuracy due to superior noise removal.
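One simple way to picture spatial filtering of RSS is a neighbourhood mean: each survey point's reading is replaced by the average of all readings taken within a fixed radius. The sketch below is only an assumed stand-in for the filter in the paper, whose exact form the abstract does not give. The function name, the radius and the sample readings are hypothetical, and averaging dBm values directly is itself a simplification.

```python
import math


def spatially_average(points, radius):
    """Smooth RSS readings by averaging over nearby survey points.

    points: list of (x, y, rss_dbm) tuples; radius: neighbourhood size in metres.
    Each point is always its own neighbour, so the mean is well defined.
    """
    smoothed = []
    for x, y, _ in points:
        neighbours = [
            rss for px, py, rss in points
            if math.hypot(px - x, py - y) <= radius
        ]
        smoothed.append((x, y, sum(neighbours) / len(neighbours)))
    return smoothed


# Hypothetical clustered survey: two close points and one isolated point.
survey = [(0.0, 0.0, -50.0), (1.0, 0.0, -60.0), (10.0, 0.0, -80.0)]
print(spatially_average(survey, radius=2.0))
```

With clustered survey points, each neighbourhood contains several readings, which is one plausible reason clustering would help noise removal.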

Keywords: Position measurement, Wireless LAN, Radio navigation, Filtering

Downloads: 1520

13359 Representing Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building a prediction model. The main objective in time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit a low-dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques for dealing with uncertain data, specifically those that handle uncertain data conditions by minimizing the loss of compression properties.

Keywords: Compression properties, uncertainty, uncertain time series, mining technique, weather prediction.

Downloads: 1619

13358 Greening the Greyfields: Unlocking the Redevelopment Potential of the Middle Suburbs in Australian Cities

Authors: Peter Newton, Peter Newman, Stephen Glackin, Roman Trubka

Abstract:

Pressures for urban redevelopment are intensifying in all large cities. A new logic for urban development is required – green urbanism – that provides a spatial framework for directing population and investment inwards to brownfields and greyfields precincts, rather than outwards to the greenfields. This represents both a major opportunity and a major challenge for city planners in pluralist liberal democracies. However, plans for more compact forms of urban redevelopment are stalling in the face of community resistance. A new paradigm and spatial planning platform is required that will support timely multi-level and multi-actor stakeholder engagement, resulting in the emergence of consensus plans for precinct-level urban regeneration capable of more rapid implementation. Using Melbourne, Australia as a case study, this paper addresses two of the urban intervention challenges – where and how – via the application of a 21st century planning tool ENVISION created for this purpose.

Keywords: Green urbanism, greyfields, planning tools, urban regeneration.

Downloads: 3122

13357 Are XBRL-based Financial Reports Better than Non-XBRL Reports? A Quality Assessment

Authors: Zhenkun Wang, Simon S. Gao

Abstract:

Using a scoring system, this paper provides a comparative assessment of data quality in XBRL-formatted financial reports and non-XBRL financial reports. It shows a major improvement in the data quality of XBRL-formatted financial reports. Although XBRL-formatted financial reports did not show much advantage in quality at the beginning, more recent XBRL financial reports display a large improvement in data quality in almost all aspects. With improved XBRL web data management, presentation and analysis applications, XBRL-formatted financial reports are much more accessible, more accurate and more timely.

Keywords: Data Quality; Financial Report; Information; XBRL

Downloads: 2565

13356 Time Series Simulation by Conditional Generative Adversarial Net

Authors: Rao Fu, Jie Chen, Shutian Zeng, Yiping Zhuang, Agus Sudjianto

Abstract:

Generative Adversarial Net (GAN) has proved to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions include both categorical and continuous variables with different auxiliary information. Our simulation studies show that CGAN has the capability to learn different types of normal and heavy-tailed distributions, as well as dependent structures of different time series. It also has the capability to generate conditional predictive distributions consistent with training data distributions. We also provide an in-depth discussion of the rationale behind GAN and of neural networks as hierarchical splines to establish a clear connection with existing statistical methods of distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), and it can also predict the movement of the market risk factors. We present a real data analysis, including backtesting, to demonstrate that CGAN can outperform Historical Simulation (HS), a popular method in market risk analysis for calculating VaR. CGAN can also be applied in economic time series modeling and forecasting. In this regard, we have included an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN at the end of the paper.
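The abstract names Historical Simulation (HS) as the baseline for calculating VaR. The sketch below illustrates HS only, not the CGAN model: it reads VaR and ES directly off the worst outcomes in a profit-and-loss history. The function name, the tail-size convention (rounding the tail count, with at least one observation) and the sample data are assumptions; other quantile conventions give slightly different numbers.

```python
def historical_var_es(pnl, confidence=0.99):
    """Estimate VaR and ES by historical simulation.

    pnl: list of historical profit-and-loss values (losses are negative).
    Returns (var, es) expressed as positive loss amounts.
    """
    losses = sorted((-x for x in pnl), reverse=True)  # largest loss first
    n_tail = max(1, int(round(len(losses) * (1 - confidence))))
    tail = losses[:n_tail]
    var = tail[-1]              # smallest loss inside the tail
    es = sum(tail) / len(tail)  # average loss inside the tail
    return var, es


# Hypothetical P&L history: 100 evenly spaced outcomes from -50 to 49.
history = [float(x) for x in range(-50, 50)]
var_95, es_95 = historical_var_es(history, confidence=0.95)
print(var_95, es_95)
```

A generative model such as the CGAN described above would replace the historical sample with simulated scenarios before the same tail statistics are computed.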

Keywords: Conditional Generative Adversarial Net, market and credit risk management, neural network, time series.

Downloads: 1197

13355 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining

Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrêa, Ronaldo G. Morato

Abstract:

Environmental changes and major natural disasters are more prevalent in the world due to the damage that humanity has caused to nature, and this damage directly affects the lives of animals. Thus, the study of animal behavior and of animals' interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine patterns of animal behavior, and technological advances have expanded the ability to track animals and, consequently, behavioral studies. There is a lot of research on animal movement and behavior, but we note the lack of a proposal that combines resources, allows for exploratory analysis of animal movement, and provides statistical measures of individual animal behavior and its interaction with the environment. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provides a first step towards answering questions about individual animal behavior and animals' interactions with other animals over time and space. We evaluated the framework using data from monitored jaguars in the city of Miranda, Pantanal, Brazil, to verify whether AniMoveMineR makes it possible to identify the level of interaction between these jaguars. The results were positive and provided indications about the individual behavior of the jaguars and about which jaguars have the highest or lowest correlation.

Keywords: Data mining, data science, trajectory, animal behavior.

Downloads: 916

13354 An Automatic Tool for Checking Consistency between Data Flow Diagrams (DFDs)

Authors: Rosziati Ibrahim, Siow Yen Yen

Abstract:

The system development life cycle (SDLC) is a process used during the development of any system. The SDLC consists of four main phases: analysis, design, implementation and testing. During the analysis phase, a context diagram and data flow diagrams are used to produce the process model of a system. Consistency between the context diagram and the lower-level data flow diagrams is very important for smoothing the development process of a system. However, manually checking consistency from the context diagram to the lower-level data flow diagrams with a checklist is a time-consuming process. At the same time, the limited human ability to detect errors is one of the factors that influence the correctness and balancing of the diagrams. This paper presents a tool that automates the consistency check between Data Flow Diagrams (DFDs) based on the rules of DFDs. The tool serves two purposes: as an editor to draw the diagrams and as a checker to check the correctness of the diagrams drawn. The consistency check from the context diagram to the lower-level data flow diagrams is embedded in the tool to overcome the manual checking problem.
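One common DFD rule is balancing: the data flows entering and leaving a parent process should match the flows crossing the boundary of its child diagram. The sketch below checks only that single rule and is not the tool described in the paper; the function name, the dictionary layout and the flow names are hypothetical.

```python
def check_balance(parent_flows, child_boundary_flows):
    """Compare a parent process's flows with its child diagram's boundary flows.

    Both arguments are dicts with "in" and "out" lists of data flow names.
    Returns the mismatches; all four sets are empty when the levels balance.
    """
    parent_in, parent_out = set(parent_flows["in"]), set(parent_flows["out"])
    child_in, child_out = set(child_boundary_flows["in"]), set(child_boundary_flows["out"])
    return {
        "missing_inputs": parent_in - child_in,    # in parent, absent in child
        "extra_inputs": child_in - parent_in,      # in child, absent in parent
        "missing_outputs": parent_out - child_out,
        "extra_outputs": child_out - parent_out,
    }


# Hypothetical context diagram versus its level-0 diagram.
context = {"in": ["order"], "out": ["invoice", "receipt"]}
level0 = {"in": ["order", "payment"], "out": ["invoice"]}
report = check_balance(context, level0)
print(report)
print("balanced" if not any(report.values()) else "unbalanced")
```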

Keywords: Data Flow Diagram, Context Diagram, Consistency Check, Syntax and Semantic Rules

Downloads: 3438

13353 Garden Culture in Islamic Civilization: A Glance at the Birth, Development and Current Situation

Authors: Parisa Göker

Abstract:

With the birth of Islam, the descriptions of paradise in the Quran have spread across three continents since the 7th century, appearing in palace gardens as a reflection of Islamic culture. The design characteristics of Islamic gardens emerge under the influence of religious beliefs, while also taking form according to the cultural, climatic and soil characteristics of their geography, which gives rise to differences among them. These differences can be seen in the garden examples that have survived to the present from the civilizations in the lands where Islam spread. The main material of this research is the Islamic gardens in Iran and Spain. Field studies were carried out at the Alhambra Palace in Granada, Spain, and the Shah Goli garden in Tabriz, Iran. In this study, the birth of Islamic gardens, the spatial perception of paradise, design principles, spatial structure, and the structural and planting materials used are examined. The characteristics and differentiation of the gardens examined in different cultures and geographies are also revealed. In the conclusion section, the Islamic garden examples from Iran and Spain are evaluated and their properties determined.

Keywords: Islamic civilization, Islamic architecture, cultural landscape, Islamic garden.

Downloads: 1271

13352 Slug Tracking Simulation of Severe Slugging Experiments

Authors: Tor Kindsbekken Kjeldby, Ruud Henkes, Ole Jørgen Nydal

Abstract:

Experimental data from an atmospheric air/water terrain slugging case has been made available by the Shell Amsterdam research center, and has been subject to numerical simulation and comparison with a one-dimensional two-phase slug tracking simulator under development at the Norwegian University of Science and Technology. The code is based on tracking of liquid slugs in pipelines using a Lagrangian grid formulation implemented in C++ with object-oriented techniques. An existing hybrid spatial discretization scheme is tested, in which the stratified regions are modelled by the two-fluid model. The slug regions are treated as incompressible, thus requiring a single momentum balance over the whole slug. Upon comparison with the experimental data, the period of the simulated severe slugging cycle is observed to be sensitive to slug generation in the horizontal parts of the system. Two different slug initiation methods have been tested with the slug tracking code, and grid dependency has been investigated.

Keywords: Hydrodynamic initiation, slug tracking, terrain slugging, two-fluid model, two-phase flow.

Downloads: 3220

13351 Fault Detection and Identification of COSMED K4b2 Based On PCA and Neural Network

Authors: Jing Zhou, Steven Su, Aihuang Guo

Abstract:

COSMED K4b2 is a portable electrical device designed to test pulmonary functions. It is ideal for many applications that require measurement of the cardio-respiratory response, either in the field or in the lab, and it can deliver real-time data to a sink node or a PC base station while storing the data in memory at the same time. However, the actual sensor outputs and the data received may contain errors, such as impulsive noise, which can be related to sensors, low batteries, the environment or disturbances in the data acquisition process. These abnormal outputs might cause misinterpretation of the exercise or daily activities of the persons being monitored. In this paper we propose an effective and feasible method to detect and identify such errors using principal component analysis (PCA) and a back propagation (BP) neural network.

Keywords: BP Neural Network, Exercising Testing, Fault Detection and Identification, Principal Component Analysis.

Downloads: 3074

13350 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of the internet, which generates a tremendous volume of data every day, and because of the immense advances in technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determines to which group each data instance in a given dataset belongs. They are used to classify data into different classes according to desired criteria. Generally, a classification technique is based on either statistics or machine learning. Each of these types has its own limits. Data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed most common classification techniques, with an f-measure exceeding 97% on the IRIS dataset.

Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.

Downloads: 1526

13349 The Robust Clustering with Reduction Dimension

Authors: Dyah E. Herwindiati

Abstract:

Clustering is the process of identifying homogeneous groups of objects, called clusters. It is an interesting topic in data mining. The members of a group or class have similar characteristics. This paper discusses a robust clustering process for image data with two dimension reduction approaches, i.e. two-dimensional principal component analysis (2DPCA) and principal component analysis (PCA). A standard approach to handling high-dimensional data is dimension reduction, which transforms the data into a lower-dimensional space with limited loss of information. One of the most common forms of dimensionality reduction is PCA. 2DPCA is often described as a variant of PCA in which the image matrices are treated directly as 2D matrices; they do not need to be transformed into vectors, so the image covariance matrix can be constructed directly from the original image matrices. The decomposed classical covariance matrix is very sensitive to outlying observations. The objective of the paper is to compare the performance of the robust minimizing vector variance (MVV) estimator in 2DPCA and in PCA for clustering arbitrary image data when outliers are hidden in the data set. The simulation aspects of robustness and an illustration of clustering images are discussed at the end of the paper.

Keywords: Breakdown point, Consistency, 2DPCA, PCA, Outlier, Vector Variance

Downloads: 1696

13348 Operational Risk – Scenario Analysis

Authors: Milan Rippel, Petr Teply

Abstract:

This paper focuses on operational risk measurement techniques and on economic capital estimation methods. A data sample of operational losses provided by an anonymous Central European bank is analyzed using several approaches. Loss Distribution Approach and scenario analysis method are considered. Custom plausible loss events defined in a particular scenario are merged with the original data sample and their impact on capital estimates and on the financial institution is evaluated. Two main questions are assessed – What is the most appropriate statistical method to measure and model operational loss data distribution? and What is the impact of hypothetical plausible events on the financial institution? The g&h distribution was evaluated to be the most suitable one for operational risk modeling. The method based on the combination of historical loss events modeling and scenario analysis provides reasonable capital estimates and allows for the measurement of the impact of extreme events on banking operations.
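For readers unfamiliar with the g&h distribution, it is usually defined as a transformation of a standard normal variable Z: X = a + b * ((exp(g*Z) - 1) / g) * exp(h*Z^2 / 2), where g controls skewness and h controls tail heaviness. The sketch below applies that transformation to simulated normal draws. The parameter values and the function name are hypothetical and are not fitted to any loss data from the paper.

```python
import math
import random


def g_and_h(z, a=0.0, b=1.0, g=0.5, h=0.1):
    """Map a standard normal value z to a Tukey g-and-h value.

    a: location, b: scale, g: skewness, h: tail heaviness (h >= 0).
    """
    if g == 0:
        core = z  # limit of (exp(g*z) - 1) / g as g -> 0
    else:
        core = (math.exp(g * z) - 1.0) / g
    return a + b * core * math.exp(h * z * z / 2.0)


# Simulate hypothetical losses and look at a high empirical quantile.
random.seed(0)
sample = sorted(g_and_h(random.gauss(0.0, 1.0)) for _ in range(10_000))
print(sample[int(0.999 * len(sample))])
```

In a loss distribution approach, draws like these would model individual loss severities, with a high quantile of the aggregate loss feeding the capital estimate.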

Keywords: operational risk, scenario analysis, economic capital, loss distribution approach, extreme value theory, stress testing

Downloads: 2426

13347 Geo-Spatial Methods to Better Understand Urban Food Deserts

Authors: Brian Ceh, Alison Jackson-Holland

Abstract:

Food deserts are a reality in some cities. These deserts can be described as a shortage of healthy food options within close proximity of consumers. The shortage in this case typically results from a lack of stores in an urban area that provide adequate fruit and vegetable choices. This study explores new avenues to better understand food deserts by examining the modes of transportation that are available to shoppers or consumers, e.g. walking, automobile, or public transit. Further, this study is unique in that it explores not only the location of large grocery stores, but also that of small grocery and convenience stores. Some socio-economic indicators, such as personal income, are also explored to determine any possible association with food deserts. In addition, to help our understanding of food deserts, complex network spatial models built on suitable algorithms are used to investigate the possibility of food deserts in the city of Hamilton, Canada. It is found that Hamilton is adequately serviced by retailers who provide healthy food choices and that the food desert phenomenon is almost absent.

Keywords: Canada, desert, food, Hamilton, stores.

Downloads: 1297

13346 Benchmarking Cleaner Production Performance of Coal-fired Power Plants Using Two-stage Super-efficiency Data Envelopment Analysis

Authors: Shao-lun Zeng, Yu-long Ren

Abstract:

Benchmarking cleaner production performance is an effective way of achieving pollution control and emission reduction in the coal-fired power industry. A benchmarking method using two-stage super-efficiency data envelopment analysis for coal-fired power plants is proposed: first, to improve the cleaner production performance of DEA-inefficient or weakly DEA-efficient plants, then to select the benchmark from the performance-improved power plants. An empirical study is carried out with the survey data of 24 coal-fired power plants. The result shows that in the first stage the performance of 16 plants is DEA-efficient and that of 8 plants is relatively inefficient. The target values for improving DEA-inefficient plants are acquired by projection analysis. The efficient performance of 24 power plants and the benchmarking plant is achieved in the second stage. The two-stage benchmarking method is a practical way to select the optimal benchmark in the cleaner production of the coal-fired power industry and will continuously improve plants' cleaner production performance.

Keywords: benchmarking, cleaner production performance, coal-fired power plant, super-efficiency data envelopment analysis

Downloads: 2430

13345 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used for classification into groups. The result of the analysis is a pattern which can be used to identify a data set without the need to obtain the input data used to create this pattern. An important requirement in this process is careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. In case of missing pedigree information, other methods can be used to trace an animal's origin. Genetic diversity recorded in genetic data holds relatively useful information for identifying animals originating from individual countries. We can conclude that the application of data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.

Keywords: Genetic data, Pinzgau cattle, supervised learning.

Downloads: 2317

13344 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Downloads: 2479

13343 Energy Efficiency Analysis of Crossover Technologies in Industrial Applications

Authors: W. Schellong

Abstract:

Industry accounts for one-third of global final energy demand. Crossover technologies (e.g. motors, pumps, process heat, and air conditioning) play an important role in improving energy efficiency. These technologies are used in many applications independent of the production branch. Electrical power in particular is used by drives, pumps, compressors, and lighting. The paper demonstrates the algorithm of the energy analysis through selected case studies for typical industrial processes. The energy analysis represents an essential part of energy management systems (EMS). Generally, process control systems (PCS) can support EMS. They provide information about the production process, and they organize the maintenance actions. Combining these tools into an integrated process allows the development of an energy critical equipment strategy. Thus, asset and energy management can use the same common data to improve energy efficiency.

Keywords: Crossover technologies, data management, energy analysis, energy efficiency, process control.

Downloads: 960

13342 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application

Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil

Abstract:

In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis, with dichotomous values showing presence or absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have a total of sixteen variables, of which one is assumed dependent and the other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principal component analysis are applied. Based on the results of data reduction, we have considered only 14 out of sixteen factors.
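The binary regression step described in this abstract can be illustrated with a minimal sketch of logistic regression fitted by batch gradient descent. The data below are synthetic and generated only for the example (they are not the hospital data), the function names and hyperparameters are hypothetical, and the principal component analysis step is omitted for brevity.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    """Estimated probability that the dichotomous outcome equals 1."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(rows, labels, learning_rate=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def make_synthetic_data(n, seed=0):
    """Synthetic example: the outcome depends on the first feature only."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        p = sigmoid(3.0 * x1 - 0.2)
        rows.append([x1, x2])
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_synthetic_data(300, seed=1)
    weights, bias = fit_logistic(rows, labels)
    print("weights:", weights, "bias:", bias)
```

In a setting like the one described, each row would hold the independent variables for one patient (or their principal component scores) and each label the presence or absence of disease.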

Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principal component analysis, unstable angina (U.A.).

Downloads: 2112

13341 Multiple Approaches for Ultrasonic Cavitation Monitoring of Oxygen-Loaded Nanodroplets

Authors: Simone Galati, Adriano Troia

Abstract:

Ultrasound (US) is widely used in the medical field for a variety of diagnostic techniques but, in recent years, it has also been attracting great interest for therapeutic aims. Regarding drug delivery, the use of US as an activation source provides better spatial delivery confinement and limits undesired side effects. However, at present there is no complete characterization at a fundamental level of the different signals produced by sono-activated nanocarriers. Therefore, the aim of this study is to obtain a metrological characterization of the cavitation phenomena induced by US through three parallel investigation approaches. US was focused into a channel of a customized phantom through which a solution with oxygen-loaded nanodroplets (OLNDs) was made to flow, and the cavitation activity was monitored. Both quantitative and qualitative real-time analyses were performed, giving information about the dynamics of bubble formation, oscillation and final implosion with respect to the working acoustic pressure and the type of nanodroplets, compared with pure water. From this analysis a possible interpretation of the observed results is proposed.

Keywords: Cavitation, Drug Delivery, Nanodroplets, Ultrasound.

Downloads: 601