Search results for: biological data
7426 A Comparison of Image Data Representations for Local Stereo Matching
Authors: André Smith, Amr Abdel-Dayem
Abstract:
The stereo matching problem, while having been present for several decades, continues to be an active area of research. The goal of this research is to find correspondences between elements found in a set of stereoscopic images. With these pairings, it is possible to infer the distance of objects within a scene, relative to the observer. Advancements in this field have led to experimentations with various techniques, from graph-cut energy minimization to artificial neural networks. At the basis of these techniques is a cost function, which is used to evaluate the likelihood of a particular match between points in each image. While at its core, the cost is based on comparing the image pixel data; there is a general lack of consistency as to what image data representation to use. This paper presents an experimental analysis to compare the effectiveness of more common image data representations. The goal is to determine the effectiveness of these data representations to reduce the cost for the correct correspondence relative to other possible matches.Keywords: Colour data, local stereo matching, stereo correspondence, disparity map.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9197425 Flexible, Adaptable and Scaleable Business Rules Management System for Data Validation
Authors: Kashif Kamran, Farooque Azam
Abstract:
The policies governing the business of any organization are well reflected in her business rules. The business rules are implemented by data validation techniques, coded during the software development process. Any change in business policies results in change in the code written for data validation used to enforce the business policies. Implementing the change in business rules without changing the code is the objective of this paper. The proposed approach enables users to create rule sets at run time once the software has been developed. The newly defined rule sets by end users are associated with the data variables for which the validation is required. The proposed approach facilitates the users to define business rules using all the comparison operators and Boolean operators. Multithreading is used to validate the data entered by end user against the business rules applied. The evaluation of the data is performed by a newly created thread using an enhanced form of the RPN (Reverse Polish Notation) algorithm.Keywords: Business Rules, data validation, multithreading, Reverse Polish Notation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22717424 Influential Effect of Self-Healing Treatment on Water Absorption and Electrical Resistance of Normal and Light Weight Aggregate Concretes
Authors: B. Tayebani, N. Hosseinibalam, D. Mostofinejad
Abstract:
Interest in using bacteria in cement materials due to its positive influences has been increased. Cement materials such as mortar and concrete basically suffer from higher porosity and water absorption compared to other building materials such as steel materials. Because of the negative side-effects of certain chemical techniques, biological methods have been proposed as a desired and environmentally friendly strategy for reducing concrete porosity and diminishing water absorption. This paper presents the results of an experimental investigation carried out to evaluate the influence of Sporosarcina pasteurii bacteria on the behaviour of two types of concretes (light weight aggregate concrete and normal weight concrete). The resistance of specimens to water penetration by testing water absorption and evaluating the electrical resistance of those concretes was examined and compared. As a conclusion, 20% increase in electrical resistance and 10% reduction in water absorption of lightweight aggregate concrete (LWAC) and for normal concrete the results show 7% decrease in water absorption and almost 10% increase in electrical resistance.
Keywords: Bacteria, biological method, normal weight concrete, lightweight aggregate concrete, water absorption, electrical resistance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10107423 Tidal Data Analysis using ANN
Authors: Ritu Vijay, Rekha Govil
Abstract:
The design of a complete expansion that allows for compact representation of certain relevant classes of signals is a central problem in signal processing applications. Achieving such a representation means knowing the signal features for the purpose of denoising, classification, interpolation and forecasting. Multilayer Neural Networks are relatively a new class of techniques that are mathematically proven to approximate any continuous function arbitrarily well. Radial Basis Function Networks, which make use of Gaussian activation function, are also shown to be a universal approximator. In this age of ever-increasing digitization in the storage, processing, analysis and communication of information, there are numerous examples of applications where one needs to construct a continuously defined function or numerical algorithm to approximate, represent and reconstruct the given discrete data of a signal. Many a times one wishes to manipulate the data in a way that requires information not included explicitly in the data, which is done through interpolation and/or extrapolation. Tidal data are a very perfect example of time series and many statistical techniques have been applied for tidal data analysis and representation. ANN is recent addition to such techniques. In the present paper we describe the time series representation capabilities of a special type of ANN- Radial Basis Function networks and present the results of tidal data representation using RBF. Tidal data analysis & representation is one of the important requirements in marine science for forecasting.Keywords: ANN, RBF, Tidal Data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16577422 Spatial Data Mining by Decision Trees
Authors: S. Oujdi, H. Belbachir
Abstract:
Existing methods of data mining cannot be applied on spatial data because they require spatial specificity consideration, as spatial relationships. This paper focuses on the classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches Join materialization and Querying on the fly the different tables. Similar works have been done on these two main approaches, the first - Join materialization - favors the processing time in spite of memory space, whereas the second - Querying on the fly different tables- promotes memory space despite of the processing time. The modified C4.5 algorithm requires three entries tables: a target table, a neighbor table, and a spatial index join that contains the possible spatial relationship among the objects in the target table and those in the neighbor table. Thus, the proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works of classification by spatial decision trees will be detailed.
Keywords: C4.5 Algorithm, Decision trees, S-CART, Spatial data mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 29887421 Affine Projection Algorithm with Variable Data-Reuse Factor
Authors: ChangWoo Lee, Young Kow Lee, Sung Jun Ban, SungHoo Choi, Sang Woo Kim
Abstract:
This paper suggests a new Affine Projection (AP) algorithm with variable data-reuse factor using the condition number as a decision factor. To reduce computational burden, we adopt a recently reported technique which estimates the condition number of an input data matrix. Several simulations show that the new algorithm has better performance than that of the conventional AP algorithm.
Keywords: Affine projection algorithm, variable data-reuse factor, condition number, convergence rate, misalignment.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15467420 Monitoring of Water Pollution and Its Consequences: An Overview
Authors: N. Singh, N. Sharma, J. K. Katnoria
Abstract:
Water a vital component for all living forms is derived from variety of sources, including surface water (rivers, lakes, reservoirs and ponds) and ground water (aquifers). Over the years of time, water bodies are subjected to human interference regularly resulting in deterioration of water quality. Therefore, pollution of water bodies has become matter of global concern. As the water quality closely relate to human health, water analysis before usage is of immense importance. Improper management of water bodies can cause serious problems in availability and quality of water. The quality of water may be described according to their physico-chemical and microbiological characteristics. For effective maintenance of water quality through appropriate control measures, continuous monitoring of metals, physico-chemical and biological parameter is essential for the establishment of baseline data for the water quality in any study area. The present study has focused on to explore the status of water pollution in various areas and to estimate the magnitude of its toxicity using different bioassay.
Keywords: Genotoxicity, Heavy metals, Mutagenicity, Physico-chemical analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35587419 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization
Authors: Hironori Karachi, Haruka Yamashita
Abstract:
Recently, the system of quick response (QR) code is getting popular. Many companies introduce new QR code payment services and the services are competing with each other to increase the number of users. For increasing the number of users, we should grasp the difference of feature of the demographic information, usage information, and value of users between services. In this study, we conduct an analysis of real-world data provided by Nomura Research Institute including the demographic data of users and information of users’ usages of two services; LINE Pay, and PayPay. For analyzing such data and interpret the feature of them, Nonnegative Matrix Factorization (NMF) is widely used; however, in case of the target data, there is a problem of the missing data. EM-algorithm NMF (EMNMF) to complete unknown values for understanding the feature of the given data presented by matrix shape. Moreover, for comparing the result of the NMF analysis of two matrices, there is Discriminant NMF (DNMF) shows the difference of users features between two matrices. In this study, we combine EMNMF and DNMF and also analyze the target data. As the interpretation, we show the difference of the features of users between LINE Pay and Paypay.
Keywords: Data science, non-negative matrix factorization, missing data, quality of services.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4577418 Using Data Mining for Learning and Clustering FCM
Authors: Somayeh Alizadeh, Mehdi Ghazanfari, Mohammad Fathian
Abstract:
Fuzzy Cognitive Maps (FCMs) have successfully been applied in numerous domains to show relations between essential components. In some FCM, there are more nodes, which related to each other and more nodes means more complex in system behaviors and analysis. In this paper, a novel learning method used to construct FCMs based on historical data and by using data mining and DEMATEL method, a new method defined to reduce nodes number. This method cluster nodes in FCM based on their cause and effect behaviors.Keywords: Clustering, Data Mining, Fuzzy Cognitive Map(FCM), Learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20187417 Modeling Low Voltage Power Line as a Data Communication Channel
Authors: Eklas Hossain, Sheroz Khan, Ahad Ali
Abstract:
Power line communications may be used as a data communication channel in public and indoor distribution networks so that it does not require the installing of new cables. Industrial low voltage distribution network may be utilized for data transfer required by the on-line condition monitoring of electric motors. This paper presents a pilot distribution network for modeling low voltage power line as data transfer channel. The signal attenuation in communication channels in the pilot environment is presented and the analysis is done by varying the corresponding parameters for the signal attenuation.Keywords: Data communication, indoor distribution networks, low voltage, power line.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32857416 Generating Concept Trees from Dynamic Self-organizing Map
Authors: Norashikin Ahmad, Damminda Alahakoon
Abstract:
Self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as Growing Self-organizing Map (GSOM) has been developed to overcome the problem of fixed structure in SOM to enable better representation of the discovered patterns. However, in mining large datasets or historical data the hierarchical structure of the data is also useful to view the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of tree from different spread factor values of GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from GSOM, thus, eliminating the need for re-clustering of the data from scratch to obtain a hierarchical view of the data under study.
Keywords: dynamic self-organizing map, concept formation, clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14617415 Optical Fiber Data Throughput in a Quantum Communication System
Authors: Arash Kosari, Ali Araghi
Abstract:
A mathematical model for an optical-fiber communication channel is developed which results in an expression that calculates the throughput and loss of the corresponding link. The data are assumed to be transmitted by using of separate photons with different polarizations. The derived model also shows the dependency of data throughput with length of the channel and depolarization factor. It is observed that absorption of photons affects the throughput in a more intensive way in comparison with that of depolarization. Apart from that, the probability of depolarization and the absorption of radiated photons are obtained.Keywords: Absorption, data throughput, depolarization, optical fiber.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16867414 Fuzzy Clustering of Categorical Attributes and its Use in Analyzing Cultural Data
Authors: George E. Tsekouras, Dimitris Papageorgiou, Sotiris Kotsiantis, Christos Kalloniatis, Panagiotis Pintelas
Abstract:
We develop a three-step fuzzy logic-based algorithm for clustering categorical attributes, and we apply it to analyze cultural data. In the first step the algorithm employs an entropy-based clustering scheme, which initializes the cluster centers. In the second step we apply the fuzzy c-modes algorithm to obtain a fuzzy partition of the data set, and the third step introduces a novel cluster validity index, which decides the final number of clusters.
Keywords: Categorical data, cultural data, fuzzy logic clustering, fuzzy c-modes, cluster validity index.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17097413 Weighted Data Replication Strategy for Data Grid Considering Economic Approach
Authors: N. Mansouri, A. Asadi
Abstract:
Data Grid is a geographically distributed environment that deals with data intensive application in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy, called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of Latest Access Largest Weight strategy. However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replication replacement task. ELALW replaces replicas based on the number of requests in future, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection selects the best replica location from among the many replicas based on response time that can be determined by considering the data transfer time, the storage access latency, the replica requests that waiting in the storage queue and the distance between nodes. Simulation results utilizing the OptorSim show our replication strategy achieve better performance overall than other strategies in terms of job execution time, effective network usage and storage resource usage.
Keywords: Data grid, data replication, simulation, replica selection, replica placement.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21107412 A Proposal of an Automatic Formatting Method for Transforming XML Data
Authors: Zhe JIN, Motomichi TOYAMA
Abstract:
PPX(Pretty Printer for XML) is a query language that offers a concise description method of formatting the XML data into HTML. In this paper, we propose a simple specification of formatting method that is a combination description of automatic layout operators and variables in the layout expression of the GENERATE clause of PPX. This method can automatically format irregular XML data included in a part of XML with layout decision rule that is referred to DTD. In the experiment, a quick comparison shows that PPX requires far less description compared to XSLT or XQuery programs doing same tasks.
Keywords: PPX, Irregular XML data, Layout decision rule, HTML.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14187411 Data Mining in Oral Medicine Using Decision Trees
Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson, Göran Falkman
Abstract:
Data mining has been used very frequently to extract hidden information from large databases. This paper suggests the use of decision trees for continuously extracting the clinical reasoning in the form of medical expert-s actions that is inherent in large number of EMRs (Electronic Medical records). In this way the extracted data could be used to teach students of oral medicine a number of orderly processes for dealing with patients who represent with different problems within the practice context over time.Keywords: Data mining, Oral Medicine, Decision Trees, WEKA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25057410 An Efficient Data Collection Approach for Wireless Sensor Networks
Authors: Hanieh Alipour, Alireza Nemaney Pour
Abstract:
One of the most important applications of wireless sensor networks is data collection. This paper proposes as efficient approach for data collection in wireless sensor networks by introducing Member Forward List. This list includes the nodes with highest priority for forwarding the data. When a node fails or dies, this list is used to select the next node with higher priority. The benefit of this node is that it prevents the algorithm from repeating when a node fails or dies. The results show that Member Forward List decreases power consumption and latency in wireless sensor networks.Keywords: Data Collection, Wireless Sensor Network, SensorNode, Tree-Based
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24107409 A Modified Fuzzy C-Means Algorithm for Natural Data Exploration
Authors: Binu Thomas, Raju G., Sonam Wangmo
Abstract:
In Data mining, Fuzzy clustering algorithms have demonstrated advantage over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews concept of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on the study of the fuzzy c-means algorithm and its extensions, we propose a modification to the cmeans algorithm to overcome the limitations of it in calculating the new cluster centers and in finding the membership values with natural data. The efficiency of the new modified method is demonstrated on real data collected for Bhutan-s Gross National Happiness (GNH) program.Keywords: Adaptive fuzzy clustering, clustering, fuzzy logic, fuzzy clustering, c-means.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19927408 Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem
Authors: First G.M. Karthik, Second Ramachandra.V.Pujeri, Dr.
Abstract:
Generalized Center String (GCS) problem are generalized from Common Approximate Substring problem and Common substring problems. GCS are known to be NP-hard allowing the problems lies in the explosion of potential candidates. Finding longest center string without concerning the sequence that may not contain any motifs is not known in advance in any particular biological gene process. GCS solved by frequent pattern-mining techniques and known to be fixed parameter tractable based on the fixed input sequence length and symbol set size. Efficient method known as Bpriori algorithms can solve GCS with reasonable time/space complexities. Bpriori 2 and Bpriori 3-2 algorithm are been proposed of any length and any positions of all their instances in input sequences. In this paper, we reduced the time/space complexity of Bpriori algorithm by Constrained Based Frequent Pattern mining (CBFP) technique which integrates the idea of Constraint Based Mining and FP-tree mining. CBFP mining technique solves the GCS problem works for all center string of any length, but also for the positions of all their mutated copies of input sequence. CBFP mining technique construct TRIE like with FP tree to represent the mutated copies of center string of any length, along with constraints to restraint growth of the consensus tree. The complexity analysis for Constrained Based FP mining technique and Bpriori algorithm is done based on the worst case and average case approach. Algorithm's correctness compared with the Bpriori algorithm using artificial data is shown.Keywords: Constraint Based Mining, FP tree, Data mining, GCS problem, CBFP mining technique.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17037407 Real-Time Visualization Using GPU-Accelerated Filtering of LiDAR Data
Authors: Sašo Pečnik, Borut Žalik
Abstract:
This paper presents a real-time visualization technique and filtering of classified LiDAR point clouds. The visualization is capable of displaying filtered information organized in layers by the classification attribute saved within LiDAR datasets. We explain the used data structure and data management, which enables real-time presentation of layered LiDAR data. Real-time visualization is achieved with LOD optimization based on the distance from the observer without loss of quality. The filtering process is done in two steps and is entirely executed on the GPU and implemented using programmable shaders.
Keywords: Filtering, graphics, level-of-details, LiDAR, realtime visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25467406 Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System
Authors: Karima Qayumi, Alex Norta
Abstract:
The rapid generation of high volume and a broad variety of data from the application of new technologies pose challenges for the generation of business-intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods for the purposes of developing their business. Therefore, the recently decentralized data management environment is relying on a distributed computing paradigm. While data are stored in highly distributed systems, the implementation of distributed data-mining techniques is a challenge. The aim of this technique is to gather knowledge from every domain and all the datasets stemming from distributed resources. As agent technologies offer significant contributions for managing the complexity of distributed systems, we consider this for next-generation data-mining processes. To demonstrate agent-based business intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.
Keywords: Agent-oriented modeling, business Intelligence management, distributed data mining, multi-agent system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13757405 Abnormal IP Packets on 3G Mobile Data Networks
Authors: Joo-Hyung Oh, Dongwan Kang, JunHyung Cho, Chaetae Im
Abstract:
As the mobile Internet has become widespread in recent years, communication based on mobile networks is increasing. As a result, security threats have been posed with regard to the abnormal traffic of mobile networks, but mobile security has been handled with focus on threats posed by mobile malicious codes, and researches on security threats to the mobile network itself have not attracted much attention. In mobile networks, the IP address of the data packet is a very important factor for billing purposes. If one mobile terminal use an incorrect IP address that either does not exist or could be assigned to another mobile terminal, billing policy will cause problems. We monitor and analyze 3G mobile data networks traffics for a period of time and finds some abnormal IP packets. In this paper, we analyze the reason for abnormal IP packets on 3G Mobile Data Networks. And we also propose an algorithm based on IP address table that contains addresses currently in use within the mobile data network to detect abnormal IP packets.
Keywords: WCDMA, 3G, Abnormal IP address, Mobile Data Network Attack
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23387404 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons
Authors: Said Boularouk, Didier Josselin, Eitan Altman
Abstract:
In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. Indeed, the platform based on produsage gives a freedom to data producers to choose the descriptors of geocoded locations. Unfortunately, this freedom, called also folksonomy leads to complicate subsequent searches of data. We try to solve this issue in a simple but usable method to extract data from OSM databases in order to send them to visually impaired people using Text To Speech technology. We focus on how to help people suffering from visual disability to plan their itinerary, to comprehend a map by querying computer and getting information about surrounding environment in a mono-modal human-computer dialogue.Keywords: Ontology, OpenStreetMap, visually impaired people, TTS, taxonomy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8897403 A Data Mining Model for Detecting Financial and Operational Risk Indicators of SMEs
Authors: Ali Serhan Koyuncugil, Nermin Ozgulbas
Abstract:
In this paper, a data mining model to SMEs for detecting financial and operational risk indicators by data mining is presenting. The identification of the risk factors by clarifying the relationship between the variables defines the discovery of knowledge from the financial and operational variables. Automatic and estimation oriented information discovery process coincides the definition of data mining. During the formation of model; an easy to understand, easy to interpret and easy to apply utilitarian model that is far from the requirement of theoretical background is targeted by the discovery of the implicit relationships between the data and the identification of effect level of every factor. In addition, this paper is based on a project which was funded by The Scientific and Technological Research Council of Turkey (TUBITAK).
Keywords: Risk Management, Financial Risk, Operational Risk, Financial Early Warning System, Data Mining, CHAID Decision Tree Algorithm, SMEs.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31257402 Satellite Data Classification Accuracy Assessment Based from Reference Dataset
Authors: Mohd Hasmadi Ismail, Kamaruzaman Jusoff
Abstract:
In order to develop forest management strategies in tropical forest in Malaysia, surveying the forest resources and monitoring the forest area affected by logging activities is essential. There are tremendous effort has been done in classification of land cover related to forest resource management in this country as it is a priority in all aspects of forest mapping using remote sensing and related technology such as GIS. In fact classification process is a compulsory step in any remote sensing research. Therefore, the main objective of this paper is to assess classification accuracy of classified forest map on Landsat TM data from difference number of reference data (200 and 388 reference data). This comparison was made through observation (200 reference data), and interpretation and observation approaches (388 reference data). Five land cover classes namely primary forest, logged over forest, water bodies, bare land and agricultural crop/mixed horticultural can be identified by the differences in spectral wavelength. Result showed that an overall accuracy from 200 reference data was 83.5 % (kappa value 0.7502459; kappa variance 0.002871), which was considered acceptable or good for optical data. However, when 200 reference data was increased to 388 in the confusion matrix, the accuracy slightly improved from 83.5% to 89.17%, with Kappa statistic increased from 0.7502459 to 0.8026135, respectively. The accuracy in this classification suggested that this strategy for the selection of training area, interpretation approaches and number of reference data used were importance to perform better classification result.Keywords: Image Classification, Reference Data, Accuracy Assessment, Kappa Statistic, Forest Land Cover
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31427401 Biological Hotspots in the Galápagos Islands: Exploring Seasonal Trends of Ocean Climate Drivers to Monitor Algal Blooms
Authors: Emily Kislik, Gabriel Mantilla Saltos, Gladys Torres, Mercy Borbor-Córdova
Abstract:
The Galápagos Marine Reserve (GMR) is an internationally-recognized region of consistent upwelling events, high productivity, and rich biodiversity. Despite its high-nutrient, low-chlorophyll condition, the archipelago has experienced phytoplankton blooms, especially in the western section between Isabela and Fernandina Islands. However, little is known about how climate variability will affect future phytoplankton standing stock in the Galápagos, and no consistent protocols currently exist to quantify phytoplankton biomass, identify species, or monitor for potential harmful algal blooms (HABs) within the archipelago. This analysis investigates physical, chemical, and biological oceanic variables that contribute to algal blooms within the GMR, using 4 km Aqua MODIS satellite imagery and 0.125-degree wind stress data from January 2003 to December 2016. Furthermore, this study analyzes chlorophyll-a concentrations at varying spatial scales— within the greater archipelago, as well as within five smaller bioregions based on species biodiversity in the GMR. Seasonal and interannual trend analyses, correlations, and hotspot identification were performed. Results demonstrate that chlorophyll-a is expressed in two seasons throughout the year in the GMR, most frequently in September and March, with a notable hotspot in the Elizabeth Bay bioregion. Interannual chlorophyll-a trend analyses revealed highest peaks in 2003, 2007, 2013, and 2016, and variables that correlate highly with chlorophyll-a include surface temperature and particulate organic carbon. This study recommends future in situ sampling locations for phytoplankton monitoring, including the Elizabeth Bay bioregion. Conclusions from this study contribute to the knowledge of oceanic drivers that catalyze primary productivity and consequently affect species biodiversity within the GMR. Additionally, this research can inform policy and decision-making strategies for species conservation and management within bioregions of the Galápagos.
Keywords: Bioregions, ecological monitoring, phytoplankton, remote sensing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13857400 Markov Chain Monte Carlo Model Composition Search Strategy for Quantitative Trait Loci in a Bayesian Hierarchical Model
Authors: Susan J. Simmons, Fang Fang, Qijun Fang, Karl Ricanek
Abstract:
Quantitative trait loci (QTL) experiments have yielded important biological and biochemical information necessary for understanding the relationship between genetic markers and quantitative traits. For many years, most QTL algorithms only allowed one observation per genotype. Recently, there has been an increasing demand for QTL algorithms that can accommodate more than one observation per genotypic distribution. The Bayesian hierarchical model is very flexible and can easily incorporate this information into the model. Herein a methodology is presented that uses a Bayesian hierarchical model to capture the complexity of the data. Furthermore, the Markov chain Monte Carlo model composition (MC3) algorithm is used to search and identify important markers. An extensive simulation study illustrates that the method captures the true QTL, even under nonnormal noise and up to 6 QTL.Keywords: Bayesian hierarchical model, Markov chain MonteCarlo model composition, quantitative trait loci.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19637399 Analysis of Diverse Cluster Ensemble Techniques
Authors: S. Sarumathi, N. Shanthi, P. Ranjetha
Abstract:
Data mining is the procedure of determining interesting patterns from the huge amount of data. With the intention of accessing the data faster the most supporting processes needed is clustering. Clustering is the process of identifying similarity between data according to the individuality present in the data and grouping associated data objects into clusters. Cluster ensemble is the technique to combine various runs of different clustering algorithms to obtain a general partition of the original dataset, aiming for consolidation of outcomes from a collection of individual clustering outcomes. The performances of clustering ensembles are mainly affecting by two principal factors such as diversity and quality. This paper presents the overview about the different cluster ensemble algorithm along with their methods used in cluster ensemble to improve the diversity and quality in the several cluster ensemble related papers and shows the comparative analysis of different cluster ensemble also summarize various cluster ensemble methods. Henceforth this clear analysis will be very useful for the world of clustering experts and also helps in deciding the most appropriate one to determine the problem in hand.Keywords: Cluster Ensemble, Consensus Function, CSPA, Diversity, HGPA, MCLA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18427398 A Distributed Approach to Extract High Utility Itemsets from XML Data
Authors: S. Kannimuthu, K. Premalatha
Abstract:
This paper investigates a new data mining capability that entails mining of High Utility Itemsets (HUI) in a distributed environment. Existing research in data mining deals with only presence or absence of an items and do not consider the semantic measures like weight or cost of the items. Thus, HUI mining algorithm has evolved. HUI mining is the one kind of utility mining concept, aims to identify itemsets whose utility satisfies a given threshold. Although, the approach of mining HUIs in a distributed environment and mining of the same from XML data have not explored yet. In this work, a novel approach is proposed to mine HUIs from the XML based data in a distributed environment. This work utilizes Service Oriented Computing (SOC) paradigm which provides Knowledge as a Service (KaaS). The interesting patterns are provided via the web services with the help of knowledge server to answer the queries of the consumers. The performance of the approach is evaluated on various databases using execution time and memory consumption.
Keywords: Data mining, Knowledge as a Service, service oriented computing, utility mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24557397 On the Network Packet Loss Tolerance of SVM Based Activity Recognition
Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir
Abstract:
In this study, data loss tolerance of Support Vector Machines (SVM) based activity recognition model and multi activity classification performance when data are received over a lossy wireless sensor network is examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi hop wireless sensor network, which introduces high data loss. The effect of number of nodes on the reliability and multi activity classification success is demonstrated in simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on activity detection success rate of an SVM based classification algorithm has not been studied before.
Keywords: Activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2871