Search results for: Spatial data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13709

Search results for: Spatial data analysis

13349 Methodology for the Multi-Objective Analysis of Data Sets in Freight Delivery

Authors: Dale Dzemydiene, Aurelija Burinskiene, Arunas Miliauskas, Kristina Ciziuniene

Abstract:

Data flow and the purpose of reporting the data are different and dependent on business needs. Different parameters are reported and transferred regularly during freight delivery. This business practices form the dataset constructed for each time point and contain all required information for freight moving decisions. As a significant amount of these data is used for various purposes, an integrating methodological approach must be developed to respond to the indicated problem. The proposed methodology contains several steps: (1) collecting context data sets and data validation; (2) multi-objective analysis for optimizing freight transfer services. For data validation, the study involves Grubbs outliers analysis, particularly for data cleaning and the identification of statistical significance of data reporting event cases. The Grubbs test is often used as it measures one external value at a time exceeding the boundaries of standard normal distribution. In the study area, the test was not widely applied by authors, except when the Grubbs test for outlier detection was used to identify outsiders in fuel consumption data. In the study, the authors applied the method with a confidence level of 99%. For the multi-objective analysis, the authors would like to select the forms of construction of the genetic algorithms, which have more possibilities to extract the best solution. For freight delivery management, the schemas of genetic algorithms' structure are used as a more effective technique. Due to that, the adaptable genetic algorithm is applied for the description of choosing process of the effective transportation corridor. In this study, the multi-objective genetic algorithm methods are used to optimize the data evaluation and select the appropriate transport corridor. The authors suggest a methodology for the multi-objective analysis, which evaluates collected context data sets and uses this evaluation to determine a delivery corridor for freight transfer service in the multi-modal transportation network. In the multi-objective analysis, authors include safety components, the number of accidents a year, and freight delivery time in the multi-modal transportation network. The proposed methodology has practical value in the management of multi-modal transportation processes.

Keywords: Multi-objective decision support, analysis, data validation, freight delivery, multi-modal transportation, genetic programming methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 458
13348 Accelerating Side Channel Analysis with Distributed and Parallelized Processing

Authors: Kyunghee Oh, Dooho Choi

Abstract:

Although there is no theoretical weakness in a cryptographic algorithm, Side Channel Analysis can find out some secret data from the physical implementation of a cryptosystem. The analysis is based on extra information such as timing information, power consumption, electromagnetic leaks or even sound which can be exploited to break the system. Differential Power Analysis is one of the most popular analyses, as computing the statistical correlations of the secret keys and power consumptions. It is usually necessary to calculate huge data and takes a long time. It may take several weeks for some devices with countermeasures. We suggest and evaluate the methods to shorten the time to analyze cryptosystems. Our methods include distributed computing and parallelized processing.

Keywords: DPA, distributed computing, parallelized processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1878
13347 FCA-based Conceptual Knowledge Discovery in Folksonomy

Authors: Yu-Kyung Kang, Suk-Hyung Hwang, Kyoung-Mo Yang

Abstract:

The tagging data of (users, tags and resources) constitutes a folksonomy that is the user-driven and bottom-up approach to organizing and classifying information on the Web. Tagging data stored in the folksonomy include a lot of very useful information and knowledge. However, appropriate approach for analyzing tagging data and discovering hidden knowledge from them still remains one of the main problems on the folksonomy mining researches. In this paper, we have proposed a folksonomy data mining approach based on FCA for discovering hidden knowledge easily from folksonomy. Also we have demonstrated how our proposed approach can be applied in the collaborative tagging system through our experiment. Our proposed approach can be applied to some interesting areas such as social network analysis, semantic web mining and so on.

Keywords: Folksonomy data mining, formal concept analysis, collaborative tagging, conceptual knowledge discovery, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2003
13346 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: Big data, open data, productivity, transparency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1595
13345 Transformations of Spatial Distributions of Bio-Polymers and Nanoparticles in Water Suspensions Induced by Resonance-Like Low Frequency Electrical Fields

Authors: A. A. Vasin, N. V. Klassen, A. M. Likhter

Abstract:

Water suspensions of in-organic (metals and oxides) and organic nano-objects (chitozan and collagen) were subjected to the treatment of direct and alternative electrical fields. In addition to quasi-periodical spatial patterning resonance-like performance of spatial distributions of these suspensions has been found at low frequencies of alternating electrical field. These resonances are explained as the result of creation of equilibrium states of groups of charged nano-objects with opposite signs of charges at the interparticle distances where the forces of Coulomb attraction are compensated by the repulsion forces induced by relatively negative polarization of hydrated regions surrounding the nanoparticles with respect to pure water. The low frequencies of these resonances are explained by comparatively big distances between the particles and their big masses with t\respect to masses of atoms constituting molecules with high resonance frequencies. These new resonances open a new approach to detailed modeling and understanding of mechanisms of the influence of electrical fields on the functioning of internal organs of living organisms at the level of cells and neurons.

Keywords: Bio-polymers, chitosan, collagen, nanoparticles, coulomb attraction, polarization repulsion, periodical patterning, electrical low frequency resonances, transformations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1102
13344 SVM-based Multiview Face Recognition by Generalization of Discriminant Analysis

Authors: Dakshina Ranjan Kisku, Hunny Mehrotra, Jamuna Kanta Sing, Phalguni Gupta

Abstract:

Identity verification of authentic persons by their multiview faces is a real valued problem in machine vision. Multiview faces are having difficulties due to non-linear representation in the feature space. This paper illustrates the usability of the generalization of LDA in the form of canonical covariate for face recognition to multiview faces. In the proposed work, the Gabor filter bank is used to extract facial features that characterized by spatial frequency, spatial locality and orientation. Gabor face representation captures substantial amount of variations of the face instances that often occurs due to illumination, pose and facial expression changes. Convolution of Gabor filter bank to face images of rotated profile views produce Gabor faces with high dimensional features vectors. Canonical covariate is then used to Gabor faces to reduce the high dimensional feature spaces into low dimensional subspaces. Finally, support vector machines are trained with canonical sub-spaces that contain reduced set of features and perform recognition task. The proposed system is evaluated with UMIST face database. The experiment results demonstrate the efficiency and robustness of the proposed system with high recognition rates.

Keywords: Biometrics, Multiview face Recognition, Gaborwavelets, LDA, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1481
13343 Hydro-Geochemistry of Qare-Sou Catchment and Gorgan Gulf, Iran: Examining Spatial and Temporal Distribution of Major Ions and Determining the River’s Hydro-Chemical Type

Authors: Milad Kurdi, Hadi Farhadian, Teymour Eslamkish

Abstract:

This study examined the hydro-geochemistry of Qare-Sou catchment and Gorgan Gulf in order to determine the spatial distribution of major ions. In this regard, six hydrometer stations in the catchment and four stations in Gorgan Gulf were chosen and the samples were collected. Results of spatial and temporal distribution of major ions have shown similar variation trends for calcium, magnesium, and bicarbonate ions. Also, the spatial trend of chloride, sulfate, sodium and potassium ions were same as Electrical Conductivity (EC) and Total Dissolved Solid (TDS). In Nahar Khoran station, the concentrations of ions were more than other stations which may be related to human activities and the role of geology. The Siah Ab station’s ions showed high concentration which is may be related to the station’s close proximity to Gorgan Gulf and the return of water to Qare-Sou River. In order to determine the interaction of water and rock, the Gibbs diagram was used and the results showed that water of the river falls in the rock range and it is affected more by weathering and reaction between water and stone and less by evaporation and crystallization. Assessment of the quality of river water by using graphic methods indicated that the type of water in this area is Ca-HCO3-Mg. Major ions concentration in Qare-Sou in the universal average was more than but not more than the allowed limit by the World Health Organization and China Standard Organization. A comparison of ions concentration in Gorgan Gulf, seas and oceans showed that the pH in Gorgan Gulf was more than the other seas but in Gorgan Gulf the concentration of anion and cation was less than other seas.

Keywords: Hydro-geochemistry, Qare-Sou River, Gorgan Gulf, major ions, Gibbs diagram, water quality, graphical methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1718
13342 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered as a multidisciplinary optimization model that can be applied in various optimization problems. PSO’s ideas are simple and easy to understand but PSO is only applied in simple model problems. We think that in order to expand the applicability of PSO in complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO in a mathematical model and apply in the multivariate data classification. First, PSOs general mathematical model (MPSO) is analyzed as a universal optimization model. Then, Model of Optimal Centroids (MOC) is proposed for the multivariate data classification. Experiments were conducted on some benchmark data sets to prove the effectiveness of MOC compared with several proposed schemes.

Keywords: Analysis of optimization, artificial intelligence-based optimization, optimization for learning and data analysis, global optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 869
13341 A Mixture Model of Two Different Distributions Approach to the Analysis of Heterogeneous Survival Data

Authors: Ülkü Erişoğlu, Murat Erişoğlu, Hamza Erol

Abstract:

In this paper we propose a mixture of two different distributions such as Exponential-Gamma, Exponential-Weibull and Gamma-Weibull to model heterogeneous survival data. Various properties of the proposed mixture of two different distributions are discussed. Maximum likelihood estimations of the parameters are obtained by using the EM algorithm. Illustrative example based on real data are also given.

Keywords: Exponential-Gamma, Exponential-Weibull, Gamma-Weibull, EM Algorithm, Survival Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4044
13340 Parametric and Nonparametric Analysis of Breast Cancer Treatments

Authors: Chunling Cong, Chris.P.Tsokos

Abstract:

The objective of the present research manuscript is to perform parametric, nonparametric, and decision tree analysis to evaluate two treatments that are being used for breast cancer patients. Our study is based on utilizing real data which was initially used in “Tamoxifen with or without breast irradiation in women of 50 years of age or older with early breast cancer" [1], and the data is supplied to us by N.A. Ibrahim “Decision tree for competing risks survival probability in breast cancer study" [2]. We agree upon certain aspects of our findings with the published results. However, in this manuscript, we focus on relapse time of breast cancer patients instead of survival time and parametric analysis instead of semi-parametric decision tree analysis is applied to provide more precise recommendations of effectiveness of the two treatments with respect to reoccurrence of breast cancer.

Keywords: decision tree, breast cancer treatments, parametricanalysis, non-parametric analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2024
13339 2D Gabor Functions and FCMI Algorithm for Flaws Detection in Ultrasonic Images

Authors: Kechida Ahmed, Drai Redouane, Khelil Mohamed

Abstract:

In this paper we present a new approach to detecting a flaw in T.O.F.D (Time Of Flight Diffraction) type ultrasonic image based on texture features. Texture is one of the most important features used in recognizing patterns in an image. The paper describes texture features based on 2D Gabor functions, i.e., Gaussian shaped band-pass filters, with dyadic treatment of the radial spatial frequency range and multiple orientations, which represent an appropriate choice for tasks requiring simultaneous measurement in both space and frequency domains. The most relevant features are used as input data on a Fuzzy c-mean clustering classifier. The classes that exist are only two: 'defects' or 'no defects'. The proposed approach is tested on the T.O.F.D image achieved at the laboratory and on the industrial field.

Keywords: 2D Gabor Functions, flaw detection, fuzzy c-mean clustering, non destructive testing, texture analysis, T.O.F.D Image (Time of Flight Diffraction).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1725
13338 An Overview of Some High Order and Multi-Level Finite Difference Schemes in Computational Aeroacoustics

Authors: Appanah Rao Appadu, Muhammad Zaid Dauhoo

Abstract:

In this paper, we have combined some spatial derivatives with the optimised time derivative proposed by Tam and Webb in order to approximate the linear advection equation which is given by = 0. Ôêé Ôêé + Ôêé Ôêé x f t u These spatial derivatives are as follows: a standard 7-point 6 th -order central difference scheme (ST7), a standard 9-point 8 th -order central difference scheme (ST9) and optimised schemes designed by Tam and Webb, Lockard et al., Zingg et al., Zhuang and Chen, Bogey and Bailly. Thus, these seven different spatial derivatives have been coupled with the optimised time derivative to obtain seven different finite-difference schemes to approximate the linear advection equation. We have analysed the variation of the modified wavenumber and group velocity, both with respect to the exact wavenumber for each spatial derivative. The problems considered are the 1-D propagation of a Boxcar function, propagation of an initial disturbance consisting of a sine and Gaussian function and the propagation of a Gaussian profile. It is known that the choice of the cfl number affects the quality of results in terms of dissipation and dispersion characteristics. Based on the numerical experiments solved and numerical methods used to approximate the linear advection equation, it is observed in this work, that the quality of results is dependent on the choice of the cfl number, even for optimised numerical methods. The errors from the numerical results have been quantified into dispersion and dissipation using a technique devised by Takacs. Also, the quantity, Exponential Error for Low Dispersion and Low Dissipation, eeldld has been computed from the numerical results. Moreover, based on this work, it has been found that when the quantity, eeldld can be used as a measure of the total error. In particular, the total error is a minimum when the eeldld is a minimum.

Keywords: Optimised time derivative, dissipation, dispersion, cfl number, Nomenclature: k : time step, h : spatial step, β :advection velocity, r: cfl/Courant number, hkrβ= , w =θ, h : exact wave number, n :time level, RPE : Relative phase error per unit time step, AFM :modulus of amplification factor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1612
13337 The Emerging Central Business District (CBD) in Lafia Town, Nigeria, and its Related Urban Planning Problems

Authors: Barau Daniel, Bashayi Obadiah

Abstract:

A spatial analysis of a large 20th century urban settlement (town/city) easily presents the celebrated central Business District (CBD). Theories of Urban Land Economics have easily justified and attempted to explain the existence of such a district activity area within the cityscape. This work examines the gradual emergence and development of the CBD in Lafia Town, Nigeria over 20 years and the attended urban problems caused by its emergence. Personal knowledge and observation of land use change are the main sources of data for the work, with unstructured interview with residents. The result are that the absence of a co-ordinate land use plan for the town, multi-nuclei nature, and regional location of surrounding towns have affected the growth pattern, hence the CBD. Traffic congestion, dispersed CBD land uses are some of the urban planning problems. The work concludes by advocating for integrating CBD uses.

Keywords: Urban planning, Central Business District (CBD), downtown.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4033
13336 Simulation of Organic Matter Variability on a Sugarbeet Field Using the Computer Based Geostatistical Methods

Authors: M. Rüstü Karaman, Tekin Susam, Fatih Er, Servet Yaprak, Osman Karkacıer

Abstract:

Computer based geostatistical methods can offer effective data analysis possibilities for agricultural areas by using vectorial data and their objective informations. These methods will help to detect the spatial changes on different locations of the large agricultural lands, which will lead to effective fertilization for optimal yield with reduced environmental pollution. In this study, topsoil (0-20 cm) and subsoil (20-40 cm) samples were taken from a sugar beet field by 20 x 20 m grids. Plant samples were also collected from the same plots. Some physical and chemical analyses for these samples were made by routine methods. According to derived variation coefficients, topsoil organic matter (OM) distribution was more than subsoil OM distribution. The highest C.V. value of 17.79% was found for topsoil OM. The data were analyzed comparatively according to kriging methods which are also used widely in geostatistic. Several interpolation methods (Ordinary,Simple and Universal) and semivariogram models (Spherical, Exponential and Gaussian) were tested in order to choose the suitable methods. Average standard deviations of values estimated by simple kriging interpolation method were less than average standard deviations (topsoil OM ± 0.48, N ± 0.37, subsoil OM ± 0.18) of measured values. The most suitable interpolation method was simple kriging method and exponantial semivariogram model for topsoil, whereas the best optimal interpolation method was simple kriging method and spherical semivariogram model for subsoil. The results also showed that these computer based geostatistical methods should be tested and calibrated for different experimental conditions and semivariogram models.

Keywords: Geostatistic, kriging, organic matter, sugarbeet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1550
13335 Evaluation of Clustering Based on Preprocessing in Gene Expression Data

Authors: Seo Young Kim, Toshimitsu Hamasaki

Abstract:

Microarrays have become the effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we express and compare the performance of several clustering methods based on data preprocessing including strategies of normalization or noise clearness. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. Consequently, clustering methods which are common used in microarray data analysis are affected by normalization and degree of noise and clearness for datasets.

Keywords: Gene expression, clustering, data preprocessing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1718
13334 Analysis of Palm Perspiration Effect with SVM for Diabetes in People

Authors: Hamdi Melih Saraoğlu, Muhlis Yıldırım, Abdurrahman Özbeyaz, Feyzullah Temurtas

Abstract:

In this research, the diabetes conditions of people (healthy, prediabete and diabete) were tried to be identified with noninvasive palm perspiration measurements. Data clusters gathered from 200 subjects were used (1.Individual Attributes Cluster and 2. Palm Perspiration Attributes Cluster). To decrase the dimensions of these data clusters, Principal Component Analysis Method was used. Data clusters, prepared in that way, were classified with Support Vector Machines. Classifications with highest success were 82% for Glucose parameters and 84% for HbA1c parametres.

Keywords: Palm perspiration, Diabetes, Support Vector Machine, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1903
13333 Data Collection in Hospital Emergencies: A Questionnaire Survey

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

Many methods are used to collect data like questionnaires, surveys, focus group interviews. Or the collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data allow conclusions to be drawn that are not supported by the data or to focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments; or use technologies allowing better "anonymity" in the responses. In this context, and to overcome the aforementioned problems, we suggest in this paper an approach to achieve the collection of relevant data, by carrying out a large-scale questionnaire-based survey. We have been able to collect good quality, consistent and practical data on hospital emergencies to improve emergency services in hospitals, especially in the case of epidemics or pandemics.

Keywords: Data collection, survey, database, data analysis, hospital emergencies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 602
13332 Dimension Reduction of Microarray Data Based on Local Principal Component

Authors: Ali Anaissi, Paul J. Kennedy, Madhu Goyal

Abstract:

Analysis and visualization of microarraydata is veryassistantfor biologists and clinicians in the field of diagnosis and treatment of patients. It allows Clinicians to better understand the structure of microarray and facilitates understanding gene expression in cells. However, microarray dataset is a complex data set and has thousands of features and a very small number of observations. This very high dimensional data set often contains some noise, non-useful information and a small number of relevant features for disease or genotype. This paper proposes a non-linear dimensionality reduction algorithm Local Principal Component (LPC) which aims to maps high dimensional data to a lower dimensional space. The reduced data represents the most important variables underlying the original data. Experimental results and comparisons are presented to show the quality of the proposed algorithm. Moreover, experiments also show how this algorithm reduces high dimensional data whilst preserving the neighbourhoods of the points in the low dimensional space as in the high dimensional space.

Keywords: Linear Dimension Reduction, Non-Linear Dimension Reduction, Principal Component Analysis, Biologists.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1556
13331 Performance Analysis of Routing Protocol for WSN Using Data Centric Approach

Authors: A. H. Azni, Madihah Mohd Saudi, Azreen Azman, Ariff Syah Johari

Abstract:

Sensor Network are emerging as a new tool for important application in diverse fields like military surveillance, habitat monitoring, weather, home electrical appliances and others. Technically, sensor network nodes are limited in respect to energy supply, computational capacity and communication bandwidth. In order to prolong the lifetime of the sensor nodes, designing efficient routing protocol is very critical. In this paper, we illustrate the existing routing protocol for wireless sensor network using data centric approach and present performance analysis of these protocols. The paper focuses in the performance analysis of specific protocol namely Directed Diffusion and SPIN. This analysis reveals that the energy usage is important features which need to be taken into consideration while designing routing protocol for wireless sensor network.

Keywords: Data Centric Approach, Directed Diffusion, SPIN WSN Routing Protocol.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2513
13330 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries which are in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for the commercial database management systems as compared to the open sources thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the most affected regions by the HIV virus are also heavily resource constrained (sub-Saharan Africa) whereas having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1934
13329 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Henceforth, the algorithm avoids exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces similar quality results and outperforms them in execution time and storage requirements.

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1475
13328 Environmental Management System for Tourist Accommodations in Amphawa, Samut Songkram,Thailand

Authors: T. Utarasakul

Abstract:

Amphawa is the most popular weekend destination for both domestic and international tourists in Thailand. More than 112 homestays and resorts have been developed along the water resources. This research aims to initiate appropriate environmental management system for riverside tourist accommodations in Amphawa by investigating current environmental characteristics. Eighty-eight riverside tourist accommodations were survey from specific questionnaire, GPS data were also gathered for spatial analysis. The results revealed that the accommodations are welled manage in regards to some environmental aspects. In order to reduce economic costs, energy efficiency equipment is utilized. A substantial number of tourist accommodations encouraged waste separation, followed by transfer to local administration organization. Grease traps also utilized in order to decrease chemical discharged, grease and oil from canteen and restaurants on natural environment. The most notable mitigation is to initiate environmental friendly cleansers for tourist accommodation along the riverside in tourism destinations.

Keywords: Environmental Management System, Tourist Accommodations, Amphawa, Samut Songkram

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2403
13327 Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Current real-estate value estimation, difficult for laymen, usually is performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates influences of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Further to that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for derivation of machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyses and predicts accordingly. The analysis process discerns the crucial factors effecting price increases by calculation of weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.

Keywords: Big data, building-value analysis, machine learning, price prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1142
13326 Welding Process Selection for Storage Tank by Integrated Data Envelopment Analysis and Fuzzy Credibility Constrained Programming Approach

Authors: Rahmad Wisnu Wardana, Eakachai Warinsiriruk, Sutep Joy-A-Ka

Abstract:

Selecting the most suitable welding process usually depends on experiences or common application in similar companies. However, this approach generally ignores many criteria that can be affecting the suitable welding process selection. Therefore, knowledge automation through knowledge-based systems will significantly improve the decision-making process. The aims of this research propose integrated data envelopment analysis (DEA) and fuzzy credibility constrained programming approach for identifying the best welding process for stainless steel storage tank in the food and beverage industry. The proposed approach uses fuzzy concept and credibility measure to deal with uncertain data from experts' judgment. Furthermore, 12 parameters are used to determine the most appropriate welding processes among six competitive welding processes.

Keywords: Welding process selection, data envelopment analysis, fuzzy credibility constrained programming, storage tank.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 776
13325 Fault Detection of Drinking Water Treatment Process Using PCA and Hotelling's T2 Chart

Authors: Joval P George, Dr. Zheng Chen, Philip Shaw

Abstract:

This paper deals with the application of Principal Component Analysis (PCA) and the Hotelling-s T2 Chart, using data collected from a drinking water treatment process. PCA is applied primarily for the dimensional reduction of the collected data. The Hotelling-s T2 control chart was used for the fault detection of the process. The data was taken from a United Utilities Multistage Water Treatment Works downloaded from an Integrated Program Management (IPM) dashboard system. The analysis of the results show that Multivariate Statistical Process Control (MSPC) techniques such as PCA, and control charts such as Hotelling-s T2, can be effectively applied for the early fault detection of continuous multivariable processes such as Drinking Water Treatment. The software package SIMCA-P was used to develop the MSPC models and Hotelling-s T2 Chart from the collected data.

Keywords: Principal component analysis, hotelling's t2 chart, multivariate statistical process control, drinking water treatment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2753
13324 Hydrological Characterization of a Watershed for Streamflow Prediction

Authors: Oseni Taiwo Amoo, Bloodless Dzwairo

Abstract:

In this paper, we extend the versatility and usefulness of GIS as a methodology for any river basin hydrologic characteristics analysis (HCA). The Gurara River basin located in North-Central Nigeria is presented in this study. It is an on-going research using spatial Digital Elevation Model (DEM) and Arc-Hydro tools to take inventory of the basin characteristics in order to predict water abstraction quantification on streamflow regime. One of the main concerns of hydrological modelling is the quantification of runoff from rainstorm events. In practice, the soil conservation service curve (SCS) method and the Conventional procedure called rational technique are still generally used these traditional hydrological lumped models convert statistical properties of rainfall in river basin to observed runoff and hydrograph. However, the models give little or no information about spatially dispersed information on rainfall and basin physical characteristics. Therefore, this paper synthesizes morphometric parameters in generating runoff. The expected results of the basin characteristics such as size, area, shape, slope of the watershed and stream distribution network analysis could be useful in estimating streamflow discharge. Water resources managers and irrigation farmers could utilize the tool for determining net return from available scarce water resources, where past data records are sparse for the aspect of land and climate.

Keywords: Hydrological characteristic, land and climate, runoff discharge, streamflow.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1433
13323 Stakeholder Analysis of Agricultural Drone Policy: A Case Study of the Agricultural Drone Ecosystem of Thailand

Authors: Thanomsin Chakreeves, Atichat Preittigun, Ajchara Phu-ang

Abstract:

This paper presents a stakeholder analysis of agricultural drone policies that meet the government's goal of building an agricultural drone ecosystem in Thailand. Firstly, case studies from other countries are reviewed. The stakeholder analysis method and qualitative data from the interviews are then presented including data from the Institute of Innovation and Management, the Office of National Higher Education Science Research and Innovation Policy Council, agricultural entrepreneurs and farmers. Study and interview data are then employed to describe the current ecosystem and to guide the implementation of agricultural drone policies that are suitable for the ecosystem of Thailand. Finally, policy recommendations are then made that the Thai government should adopt in the future.

Keywords: Drone public policy, drone ecosystem, policy development, agricultural drone.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 768
13322 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in the data-driven analysis in major areas of medicine, such as clinical decision support system, survival analysis, patient similarity analysis, image analytics etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature which can be found in the form of discharge summaries, clinical notes, procedural notes which are in human written narrative format and neither have any relational model nor any standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, which is a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique which is based on a string matching algorithm that is indexed on curated knowledge sources, that is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concepts retrieval and present the advantages the former displays over the latter.

Keywords: Information retrieval (IR), unified medical language system (UMLS), Syntax Based Analysis, natural language processing (NLP), medical informatics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 752
13321 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data

Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink

Abstract:

In this paper we describe the design and implementation of a parallel algorithm for data assimilation with ensemble Kalman filter (EnKF) for oil reservoir history matching problem. The use of large number of observations from time-lapse seismic leads to a large turnaround time for the analysis step, in addition to the time consuming simulations of the realizations. For efficient parallelization it is important to consider parallel computation at the analysis step. Our experiments show that parallelization of the analysis step in addition to the forecast step has good scalability, exploiting the same set of resources with some additional efforts.

Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2251
13320 The Effect of Fixing Kinesiology Tape onto the Plantar Surface during the Loading Phase of Gait

Authors: Albert K. Chong, Jasim Ahmed Ali Al-Baghdadi, Peter B. Milburn

Abstract:

Precise capture of plantar 3D surface of the foot at the loading gait phases on rigid substrates was found to be valuable for the assessment of the physiology, health and problems of the feet. Photogrammetry, a precision 3D spatial data capture technique is suitable for this type of dynamic application. In this research, the technique is utilised to study the plantar deformation as a result of having a strip of kinesiology tape on the plantar surface during the loading phase of gait. For this pilot study, one healthy adult male subject was recruited under the University’s human research ethics guidelines for this preliminary study. The 3D plantar deformation data with and without applying the tape were analysed. The results and analyses are presented together with detailed findings.

Keywords: Gait, human plantar, loading, Kinesiology Tape.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1879