Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 42456

Search results for: photogrammetric data analysis

42396 Bayesian Borrowing Methods for Count Data: Analysis of Incontinence Episodes in Patients with Overactive Bladder

Authors: Akalu Banbeta, Emmanuel Lesaffre, Reynaldo Martina, Joost Van Rosmalen

Abstract:

Including data from previous studies (historical data) in the analysis of the current study may reduce the sample size requirement and/or increase the power of the analysis. The most common example is incorporating historical control data in the analysis of a current clinical trial. However, this only applies when the historical control data are similar enough to the current control data. Recently, several Bayesian approaches for incorporating historical data have been proposed, such as the meta-analytic-predictive (MAP) prior and the modified power prior (MPP), both for a single historical control arm and for multiple historical control arms. Here, we examine the performance of the MAP and the MPP approaches for the analysis of (over-dispersed) count data. To this end, we propose a computational method for the MPP approach for the Poisson and the negative binomial models. We conducted an extensive simulation study to assess the performance of the Bayesian approaches. Additionally, we illustrate our approaches on an overactive bladder data set. For similar data across the control arms, the MPP approach outperformed the MAP approach with respect to statistical power. When the means across the control arms are different, the MPP yielded a slightly inflated type I error (TIE) rate, whereas the MAP did not. In contrast, when the dispersion parameters are different, the MAP gave an inflated TIE rate, whereas the MPP did not. We conclude that the MPP approach is more promising than the MAP approach for incorporating historical count data.

Keywords: count data, meta-analytic prior, negative binomial, Poisson

Procedia PDF Downloads 123
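
To make the borrowing idea in the abstract above concrete, the following is a minimal sketch of a power prior for Poisson counts with a conjugate gamma prior. It is a simplification and not the authors' method: the power parameter `a0` is held fixed here, whereas the MPP treats it as random, and the function name, default prior values, and example counts are all hypothetical.

```python
def power_prior_posterior(hist_counts, curr_counts, a0, alpha=0.5, beta=0.001):
    """Return the (shape, rate) of the gamma posterior for a Poisson mean.

    The historical likelihood is raised to the power a0 (0 = ignore the
    historical data, 1 = pool it fully) before the current data are added.
    alpha and beta are the shape and rate of the initial gamma prior
    (hypothetical default values).
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    shape = alpha + a0 * sum(hist_counts) + sum(curr_counts)
    rate = beta + a0 * len(hist_counts) + len(curr_counts)
    return shape, rate


# Hypothetical episode counts for historical and current control patients.
shape, rate = power_prior_posterior([3, 4], [2, 2], a0=0.5, alpha=1.0, beta=1.0)
posterior_mean = shape / rate
```

With `a0 = 0` the result reduces to the posterior from the current data alone, and with `a0 = 1` the historical and current controls are simply pooled.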

42395 Application of UAS in Forest Firefighting for Detecting Ignitions and 3D Fuel Volume Estimation

Authors: Artur Krukowski, Emmanouela Vogiatzaki

Abstract:

The article presents results from the AF3 project “Advanced Forest Fire Fighting” focused on Unmanned Aircraft Systems (UAS)-based 3D surveillance and 3D area mapping using high-resolution photogrammetric methods from multispectral imaging, also taking advantage of the 3D scanning techniques from the SCAN4RECO project. We also present a proprietary embedded sensor system used for the detection of fire ignitions in the forest using near-infrared based scanner with weight and form factors allowing it to be easily deployed on standard commercial micro-UAVs, such as DJI Inspire or Mavic. Results from real-life pilot trials in Greece, Spain, and Israel demonstrated added-value in the use of UAS for precise and reliable detection of forest fires, as well as high-resolution 3D aerial modeling for accurate quantification of human resources and equipment required for firefighting.

Keywords: forest wildfires, surveillance, fuel volume estimation, firefighting, ignition detectors, 3D modelling, UAV

Procedia PDF Downloads 145

42394 Data Quality Enhancement with String Length Distribution

Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda

Abstract:

The volume of collectable manufacturing data is increasing rapidly. At the same time, mega recalls are becoming a serious social problem. Under these circumstances, there is a growing need to prevent mega recalls through defect analysis, such as root cause analysis and anomaly detection, using manufacturing data. However, classifying strings in manufacturing data with traditional methods takes too long to meet the requirement of quick defect analysis. Therefore, we present the String Length Distribution Classification (SLDC) method to classify strings correctly in a short time. This method learns character features, especially the string length distribution, from Product IDs and Machine IDs in the BOM and asset list. By applying the proposed method to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, we estimate that the requirement of quick defect analysis can be fulfilled.

Keywords: string classification, data quality, feature selection, probability distribution, string length

Procedia PDF Downloads 321
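
The abstract above does not give the details of SLDC, so the following is only a rough sketch of how a string length distribution could be used for classification. The scoring rule (pick the label whose training strings most often had the same length), the function names, and the example IDs are assumptions made for illustration.

```python
from collections import Counter


def fit_length_distributions(labelled):
    """Map each label to the relative frequency of each string length."""
    counts = {}
    for text, label in labelled:
        counts.setdefault(label, Counter())[len(text)] += 1
    return {
        label: {n: c / sum(ctr.values()) for n, c in ctr.items()}
        for label, ctr in counts.items()
    }


def classify(text, dists):
    """Return the label whose training lengths best match len(text), or None."""
    best = max(dists, key=lambda label: dists[label].get(len(text), 0.0))
    # If no label has seen this length, refuse to guess.
    return best if dists[best].get(len(text), 0.0) > 0.0 else None


# Hypothetical training strings taken from a BOM and an asset list.
training = [
    ("P-000123", "product"),
    ("P-000456", "product"),
    ("M42", "machine"),
    ("M07", "machine"),
]
dists = fit_length_distributions(training)
label = classify("P-999999", dists)
```

Looking up a length in a small table is much cheaper than pattern matching on every character, which is one plausible reason a length-based feature could shorten classification time.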

42393 Microarray Data Visualization and Preprocessing Using R and Bioconductor

Authors: Ruchi Yadav, Shivani Pandey, Prachi Srivastava

Abstract:

Microarrays provide a rich source of data on the molecular working of cells. Each microarray reports on the abundance of tens of thousands of mRNAs. Virtually every human disease is being studied using microarrays with the hope of finding the molecular mechanisms of disease. Bioinformatics analysis plays an important part in processing the information embedded in large-scale expression profiling studies and in laying the foundation for biological interpretation. A basic, yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. One of the most popular platforms for microarray analysis is Bioconductor, an open source and open development software project based on the R programming language. This paper describes specific procedures for conducting quality assessment, visualization, and preprocessing of Affymetrix GeneChip data, details the different Bioconductor packages used to analyze Affymetrix microarray data, and describes the analysis and outcome of each plot.

Keywords: microarray analysis, R language, affymetrix visualization, bioconductor

Procedia PDF Downloads 481

42392 Development of New Technology Evaluation Model by Using Patent Information and Customers' Review Data

Authors: Kisik Song, Kyuwoong Kim, Sungjoo Lee

Abstract:

Many global firms and corporations derive new technologies and opportunities by identifying vacant technology through patent analysis. However, previous studies failed to focus on technologies that promised continuous growth in industrial fields. Most studies that derive new technology opportunities do not test practical effectiveness. Since previous studies depended on expert judgment, it became costly and time-consuming to evaluate new technologies based on patent analysis. Therefore, this research suggests a quantitative and systematic approach to technology evaluation indicators by using patent data and data from customer communities. The first step involves collecting the two types of data. The data are then used to construct evaluation indicators, and these indicators are applied to the evaluation of new technologies. This type of data mining provides a new method of technology evaluation and a better predictor of how new technologies are adopted.

Keywords: data mining, evaluating new technology, technology opportunity, patent analysis

Procedia PDF Downloads 382

42391 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: big data, machine learning, ontology model, urban data model

Procedia PDF Downloads 424
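
The two steps named in the abstract above (a correlation analysis, then a weighted data network) can be sketched as follows. This is an illustrative sketch only: the use of Pearson correlation, the threshold value, and the data set names are assumptions, since the abstract does not specify them.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_network(series, threshold=0.5):
    """Return {(name_a, name_b): r} for pairs with |r| >= threshold."""
    edges = {}
    for a, b in combinations(sorted(series), 2):
        r = pearson(series[a], series[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical urban data sets sampled at the same four time points.
series = {
    "traffic": [1, 2, 3, 4],
    "noise": [2, 4, 6, 8],
    "parks": [1, -1, 1, -1],
}
edges = correlation_network(series)
```

Each retained edge weight could then be used to rank which other data sets to recommend to a user who is viewing one of them.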

42390 Assessment of Rangeland Condition in a Dryland System Using UAV-Based Multispectral Imagery

Authors: Vistorina Amputu, Katja Tielboerger, Nichola Knox

Abstract:

Primary productivity in dry savannahs is constrained by moisture availability and is under increasing anthropogenic pressure. Thus, considering climate change and the unprecedented pace and scale of rangeland deterioration, methods for assessing the status of such rangelands should be easy to apply and should yield reliable and repeatable results that can be applied over large spatial scales. Global and local scale monitoring of rangelands, through satellite data and labor-intensive field measurements respectively, is limited in accurately assessing the spatiotemporal heterogeneity of vegetation dynamics to provide crucial information that detects degradation in its early stages. Fortunately, newly emerging techniques such as unmanned aerial vehicles (UAVs), associated miniaturized sensors, and improving digital photogrammetric software provide an opportunity to transcend these limitations. Yet, they have not been extensively calibrated in natural systems to encompass their complexities if they are to be integrated for long-term monitoring. Limited research using drone technology has been conducted in arid savannahs, for example to assess the health status of this dynamic two-layer vegetation ecosystem. In our study, we fill this gap by testing the relationship between UAV-estimated cover of rangeland functional attributes and field data collected in discrete sample plots in a Namibian dryland savannah along a degradation gradient. The first results are based on a supervised classification performed on the ultra-high resolution multispectral imagery to distinguish between rangeland functional attributes (bare, non-woody, and woody), with a relatively good match to the field observations. Integrating UAV-based observations to improve rangeland monitoring could greatly assist in climate-adapted rangeland management.

Keywords: arid savannah, degradation gradient, field observations, narrow-band sensor, supervised classification

Procedia PDF Downloads 143

42389 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of rapid data growth, many computer science tools have been developed to process and analyze Big Data. High-performance computing architectures have been designed to meet the processing needs of Big Data (from the standpoints of transaction processing and of strategic and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend in high-performance computing architectures, especially those related to analytics and data mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 524

42388 Efficiency of DMUs in Presence of New Inputs and Outputs in DEA

Authors: Esmat Noroozi, Elahe Sarfi, Farha Hosseinzadeh Lotfi

Abstract:

Examining the impact of data modification is known as sensitivity analysis. Many studies have considered the modification of input and output data in DEA. An issue that has not yet been considered in DEA sensitivity analysis is modification of the number of inputs and/or outputs and the impact of this modification on the efficiency status of DMUs. This paper presents systems that show the impact of adding one or more inputs or outputs on the efficiency status of DMUs. Furthermore, a model is presented for identifying the minimum number of inputs and/or outputs, from among specified inputs and outputs, that can be added so that an inefficient DMU becomes efficient. Finally, the presented systems and model are applied to a set of real data, and the results are reported.

Keywords: data envelopment analysis, efficiency, sensitivity analysis, input, output

Procedia PDF Downloads 451

42387 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of using statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes for microarray data sets is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 575

42386 Increasing the Speed of the Apriori Algorithm by Dimension Reduction

Authors: A. Abyar, R. Khavarzadeh

Abstract:

The most basic and important decision-making tool for industrial and service managers is understanding the market and customer behavior. In this regard, the Apriori algorithm, as one of the well-known machine learning methods, is used to identify customer preferences. On the other hand, with the increasing diversity of goods and services and the speed of changing customer behavior, we are faced with big data. Also, due to the large number of competitors and changing customer behavior, there is an urgent need for continuous analysis of this big data. However, the speed of the Apriori algorithm decreases with increasing data volume. In this paper, the big data PCA method is used to reduce the dimension of the data in order to increase the speed of the Apriori algorithm. Then, in the simulation section, the results are examined by generating data with different volumes and different diversity. The results show that when using this method, the speed of the Apriori algorithm increases significantly.

Keywords: association rules, Apriori algorithm, big data, big data PCA, market basket analysis

Procedia PDF Downloads 13
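
For readers unfamiliar with the algorithm named in the abstract above, here is a minimal, unoptimized sketch of Apriori frequent-itemset mining. It does not include the dimension reduction step the authors propose; it is included only to show why the work grows with the number of distinct items. The transactions and the support threshold are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that contain exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(baskets, min_support=2)
```

Every pass rescans all transactions for every surviving candidate, so reducing the number of item dimensions before mining reduces both the candidate count and the scan cost.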

42385 Statistical Analysis of Interferon-γ for the Effectiveness of an Anti-Tuberculous Treatment

Authors: Shishen Xie, Yingda L. Xie

Abstract:

Tuberculosis (TB) is a potentially serious infectious disease that remains a health concern. The Interferon Gamma Release Assay (IGRA) is a blood test used to find out whether an individual is tuberculosis positive or negative. This study applies statistical analysis to clinical data on the interferon-gamma levels of seventy-three subjects who were diagnosed with pulmonary TB and were undergoing anti-tuberculous treatment. Data analysis is performed to determine if there is a significant decline in interferon-gamma levels for the subjects during a period of six months, and to infer if the anti-tuberculous treatment is effective.

Keywords: data analysis, interferon gamma release assay, statistical methods, tuberculosis infection

Procedia PDF Downloads 308
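
The abstract above does not name the test used, so the following sketch assumes a paired t-test as one common way to check for a decline between two measurements on the same subjects. The function name and the sample values are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for the paired differences before - after."""
    diffs = [b - a for b, a in zip(before, after)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical interferon-gamma levels at baseline and after six months.
baseline = [5.0, 6.0, 7.0, 8.0]
month_six = [4.0, 4.0, 6.0, 5.0]
t_value, dof = paired_t_statistic(baseline, month_six)
```

A large positive `t_value` relative to the t distribution with `dof` degrees of freedom would indicate a decline; for skewed assay data, a rank-based alternative may be more appropriate.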

42384 Rigorous Photogrammetric Push-Broom Sensor Modeling for Lunar and Planetary Image Processing

Authors: Ahmed Elaksher, Islam Omar

Abstract:

Accurate geometric relation algorithms are imperative in Earth and planetary satellite and aerial image processing, particularly for high-resolution images that are used for topographic mapping. Most of these satellites carry push-broom sensors. These sensors are optical scanners equipped with linear arrays of CCDs. These sensors have been deployed on most EOSs. In addition, the LROC is equipped with two push-broom NACs that provide 0.5 meter-scale panchromatic images over a 5 km swath of the Moon. The HiRISE carried by the MRO and the HRSC carried by MEX are examples of push-broom sensors that produce images of the surface of Mars. Sensor models developed in photogrammetry relate image space coordinates in two or more images with the 3D coordinates of ground features. Rigorous sensor models use the actual interior orientation parameters and exterior orientation parameters of the camera, unlike approximate models. In this research, we generate a generic push-broom sensor model to process imagery acquired through linear array cameras and investigate its performance, advantages, and disadvantages in generating topographic models for the Earth, Mars, and the Moon. We also compare and contrast the utilization, effectiveness, and applicability of available photogrammetric techniques and softcopies with the developed model. We start by defining an image reference coordinate system to unify image coordinates from all three arrays. The transformation from an image coordinate system to a reference coordinate system involves a translation and three rotations. For any image point within the linear array, its image reference coordinates, the coordinates of the exposure center of the array in the ground coordinate system at the imaging epoch (t), and the corresponding ground point coordinates are related through the collinearity condition, which states that all three points must lie on the same line.
The rotation angles for each CCD array at the epoch t are defined and included in the transformation model. The exterior orientation parameters of an image line, i.e., coordinates of the exposure station and rotation angles, are computed by a polynomial interpolation function in time (t). The parameter (t) is the time at a certain epoch from a certain orbit position. Depending on the types of observations, coordinates and parameters may be treated as knowns or unknowns in different situations. The unknown coefficients are determined in a bundle adjustment. The orientation process starts by extracting the sensor position, orientation, and raw images from the PDS. The parameters of each image line are then estimated and imported into the push-broom sensor model. We also define tie points between image pairs to aid the bundle adjustment model, determine the refined camera parameters, and generate highly accurate topographic maps. The model was tested on different satellite images such as IKONOS, QuickBird, WorldView-2, and HiRISE. It was found that the accuracy of our model is comparable to those of commercial and open-source software, the computational efficiency of the developed model is high, the model could be used in different environments with various sensors, and the implementation process is much more cost- and effort-consuming.

Keywords: photogrammetry, push-broom sensors, IKONOS, HiRISE, collinearity condition

Procedia PDF Downloads 67
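
The collinearity condition and the time-dependent exterior orientation described in the abstract above are commonly written in the following textbook form. The notation is an assumption, not taken from the paper: f is the focal length, (X_j, Y_j, Z_j) a ground point, (X_s, Y_s, Z_s) the exposure center, r_{mn}(t) the elements of the rotation matrix built from the rotation angles at epoch t, and the second-order polynomial is only one possible choice of interpolation order.

```latex
\begin{align}
x_{ij} &= -f\,
  \frac{r_{11}(t)\,[X_j - X_s(t)] + r_{12}(t)\,[Y_j - Y_s(t)] + r_{13}(t)\,[Z_j - Z_s(t)]}
       {r_{31}(t)\,[X_j - X_s(t)] + r_{32}(t)\,[Y_j - Y_s(t)] + r_{33}(t)\,[Z_j - Z_s(t)]} \\
y_{ij} &= -f\,
  \frac{r_{21}(t)\,[X_j - X_s(t)] + r_{22}(t)\,[Y_j - Y_s(t)] + r_{23}(t)\,[Z_j - Z_s(t)]}
       {r_{31}(t)\,[X_j - X_s(t)] + r_{32}(t)\,[Y_j - Y_s(t)] + r_{33}(t)\,[Z_j - Z_s(t)]} \\
X_s(t) &= a_0 + a_1 t + a_2 t^2, \qquad
\omega(t) = b_0 + b_1 t + b_2 t^2
\end{align}
```

Analogous polynomials would be written for Y_s, Z_s and the remaining two rotation angles, and their coefficients are the unknowns solved for in the bundle adjustment. In many push-broom formulations the along-track image coordinate is fixed for a given image line, because each line is exposed at its own epoch.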

42383 Enabling Quantitative Urban Sustainability Assessment with Big Data

Authors: Changfeng Fu

Abstract:

Sustainable urban development has been widely accepted as common sense in modern urban planning and design. However, the measurement and assessment of urban sustainability, especially quantitative assessment, have always been an issue troubling planning and design professionals. This paper presents ongoing research on principles and techniques for quantitative urban sustainability assessment, which aim to integrate indicators, geospatial and geo-referenced data, and assessment techniques into a single mechanism. It is based on the principles and techniques of geospatial analysis with GIS and statistical analysis methods. Decision-making technologies and methods such as AHP and SMART are also adopted to reach overall assessment conclusions. The possible interfaces and presentation of data and quantitative assessment results are also described. This research is based on the knowledge, situations, and data sources of the UK, but it is potentially adaptable to other countries or regions. The implementation potential of the mechanism is also discussed.

Keywords: urban sustainability assessment, quantitative analysis, sustainability indicator, geospatial data, big data

Procedia PDF Downloads 364

42382 Reconstructability Analysis for Landslide Prediction

Authors: David Percy

Abstract:

Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that works exclusively with discrete data. While RA has been used extensively in medical applications and social science, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data are binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding type of schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The result of an RA session with a data set is a report on every combination of variables and the associated probability of landslide events occurring. In this way, every informative combination of states can be examined.

Keywords: reconstructability analysis, machine learning, landslides, raster analysis

Procedia PDF Downloads 73
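
Two ideas from the abstract above, binning a continuous layer and reporting a landslide probability per state combination, can be sketched as follows. This is not the RA model search itself; the bin edges, layer names, and sample cells are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def bin_values(values, edges):
    """Replace each value by the index of the interval it falls in (0..len(edges))."""
    return [bisect_right(edges, v) for v in values]


def landslide_rates(rows):
    """rows: (state_tuple, landslide_flag) pairs. Returns {state: observed rate}."""
    totals = defaultdict(lambda: [0, 0])
    for state, flag in rows:
        totals[state][0] += int(flag)
        totals[state][1] += 1
    return {state: hits / n for state, (hits, n) in totals.items()}


# Hypothetical raster cells: porosity (continuous), bedrock type (discrete), outcome.
porosity = [0.05, 0.20, 0.35, 0.50]
bedrock = ["basalt", "shale", "shale", "shale"]
slid = [False, False, True, True]

porosity_bins = bin_values(porosity, edges=[0.1, 0.3])
rates = landslide_rates(zip(zip(porosity_bins, bedrock), slid))
```

The resulting table is directly readable, which reflects the transparency the abstract contrasts with neural networks.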

42381 BingleSeq: A User-Friendly R Package for Single-Cell RNA-Seq Data Analysis

Authors: Quan Gu, Daniel Dimitrov

Abstract:

BingleSeq was developed as a shiny-based, intuitive, and comprehensive application that enables the analysis of single-cell RNA-sequencing count data. This was achieved by incorporating three state-of-the-art software packages for each type of RNA sequencing analysis, alongside functional annotation analysis and a way to assess the overlap of differential expression method results. In its current state, the functionality implemented within BingleSeq is comparable to that of other applications also developed with the purpose of lowering the entry requirements to RNA sequencing analyses. BingleSeq is available on GitHub and will be submitted to R/Bioconductor.

Keywords: bioinformatics, functional annotation analysis, single-cell RNA-sequencing, transcriptomics

Procedia PDF Downloads 210

42380 Social Data Aggregator and Locator of Knowledge (STALK)

Authors: Rashmi Raghunandan, Sanjana Shankar, Rakshitha K. Bhat

Abstract:

Social media contributes a vast amount of data and information about individuals to the internet. This project will greatly reduce the need for unnecessary manual analysis of large and diverse social media profiles by filtering out and combining the useful information from various social media profiles, eliminating irrelevant data. It differs from the existing social media aggregators in that it does not provide a consolidated view of various profiles. Instead, it provides consolidated INFORMATION derived from the subject’s posts and other activities. It also allows analysis over multiple profiles and analytics based on several profiles. We strive to provide a query system to provide a natural language answer to questions when a user does not wish to go through the entire profile. The information provided can be filtered according to the different use cases it is used for.

Keywords: social network, analysis, Facebook, LinkedIn, git, big data

Procedia PDF Downloads 446

42379 Wavelets Contribution on Textual Data Analysis

Authors: Habiba Ben Abdessalem

Abstract:

The emergence of very large sets of textual data has encouraged researchers to invest in this field. The purpose of textual data analysis methods is to facilitate access to this type of data by providing various graphic visualizations. Applying these methods requires a corpus pretreatment step, whose standards are set according to the objective of the problem studied. This step determines the list of forms contained in the contingency table by keeping only those that carry information. This step may, however, lead to noisy contingency tables, hence the use of a wavelet denoising function. The validity of the proposed approach is tested on a text database covering economic and political events in Tunisia over a well-defined period.

Keywords: textual data, wavelet, denoising, contingency table

Procedia PDF Downloads 280
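
The abstract above does not say which wavelet or thresholding rule is used, so the sketch below assumes the simplest case: a one-level Haar transform with soft thresholding applied to a single row of a contingency table. The function name, the threshold, and the example row are hypothetical.

```python
from math import sqrt


def haar_denoise(row, threshold):
    """One-level Haar transform, soft-threshold the details, then invert."""
    if len(row) % 2:
        raise ValueError("row length must be even")
    s = sqrt(2.0)
    # Pairwise averages (approximation) and differences (detail), orthonormal scaling.
    approx = [(row[i] + row[i + 1]) / s for i in range(0, len(row), 2)]
    detail = [(row[i] - row[i + 1]) / s for i in range(0, len(row), 2)]
    # Soft thresholding shrinks each detail coefficient toward zero.
    shrunk = [max(abs(d) - threshold, 0.0) * (1 if d >= 0 else -1) for d in detail]
    out = []
    for a, d in zip(approx, shrunk):
        out.extend([(a + d) / s, (a - d) / s])
    return out


# Hypothetical counts of one form across four documents.
smoothed = haar_denoise([4, 6, 10, 10], threshold=1.0)
```

A threshold of zero returns the row unchanged, while a very large threshold replaces each pair of cells by its mean, so the threshold controls how much small-scale variation is treated as noise.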

42378 Women Entrepreneurial Resiliency Amidst COVID-19

Authors: Divya Juneja, Sukhjeet Kaur Matharu

Abstract:

Purpose: The paper is aimed at identifying the challenging factors experienced by the women entrepreneurs in India in operating their enterprises amidst the challenges posed by the COVID-19 pandemic. Methodology: The sample for the study comprised 396 women entrepreneurs from different regions of India. A purposive sampling technique was adopted for data collection. Data was collected through a self-administered questionnaire. Analysis was performed using the SPSS package for quantitative data analysis. Findings: The results of the study state that entrepreneurial characteristics, resourcefulness, networking, adaptability, and continuity have a positive influence on the resiliency of women entrepreneurs when faced with a crisis situation. Practical Implications: The findings of the study have some important implications for women entrepreneurs, organizations, government, and other institutions extending support to entrepreneurs.

Keywords: women entrepreneurs, analysis, data analysis, positive influence, resiliency

Procedia PDF Downloads 118

42377 Generation of Quasi-Measurement Data for On-Line Process Data Analysis

Authors: Hyun-Woo Cho

Abstract:

To ensure the safety of a manufacturing process, one should quickly identify an assignable cause of a fault on an on-line basis. To this end, many statistical techniques, including linear and nonlinear methods, have been frequently utilized. However, such methods suffer from a major problem of small sample size, which is mostly attributed to the characteristics of the empirical models used as reference models. This work presents a new method to overcome the insufficiency of measurement data in the monitoring and diagnosis tasks. Some quasi-measurement data are generated from existing data based on the two indices of similarity and importance. The performance of the method is demonstrated using a real data set. The results show that the presented method is able to handle the insufficiency problem successfully. In addition, it is shown to be quite efficient in terms of computational speed and memory usage, and thus on-line implementation of the method is straightforward for monitoring and diagnosis purposes.

Keywords: data analysis, diagnosis, monitoring, process data, quality control

Procedia PDF Downloads 485

42376 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects of implementing an explicit spatial perspective in econometrics for modelling non-continuous data, in general, and count data, in particular. It provides an overview of the several spatial econometric approaches that are available to model data that are collected with reference to location in space, from the classical spatial econometrics approaches to the recent developments in spatial econometrics for modelling count data in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework necessary for structurally consistent spatial econometric count models incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures for different assumptions, and to the constraints and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for new possible directions in the processing of count data in a spatial hierarchical Bayesian econometric context.

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 598

42375 Vibrations of Springboards: Mode Shape and Time Domain Analysis

Authors: Stefano Frassinelli, Alessandro Niccolai, Riccardo E. Zich

Abstract:

Diving is an important Olympic sport. In this sport, the effective performance of the athlete is related to his capability to interact correctly with the springboard. In fact, the elevation of the jump and the correctness of the dive are influenced by the vibrations of the board. In this paper, the vibrations of the springboard will be analyzed by means of typical tools for vibration analysis. Firstly, a modal analysis will be done on two different models of the springboard; then, these two models and another one will be analyzed with a time analysis, done by integrating the equations of motion of deformable bodies. All these analyses will be compared with experimental data measured on a real springboard by means of a 6-axis accelerometer; these measurements are aimed at assessing the models proposed. The acquired data will be analyzed both in the frequency domain and in the time domain.

Keywords: springboard analysis, modal analysis, time domain analysis, vibrations

Procedia PDF Downloads 462

42374 Analysis of an Alternative Data Base for the Estimation of Solar Radiation

Authors: Graciela Soares Marcelli, Elison Eduardo Jardim Bierhals, Luciane Teresa Salvi, Claudineia Brazil, Rafael Haag

Abstract:

The sun is a source of renewable energy, and its use as a source of both heat and light is one of the most promising energy alternatives for the future. To size thermal or photovoltaic systems, a solar irradiation database is necessary. Brazil still has a small number of meteorological stations that provide frequent measurements, so reanalysis systems are a significant alternative source of radiation data. ERA-Interim is a global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). The data assimilation system used for the production of ERA-Interim is based on a 2006 version of the IFS (Cy31r2). The system includes a 4-dimensional variational analysis (4D-Var) with a 12-hour analysis window. The spatial resolution of the dataset is approximately 80 km at 60 vertical levels from the surface to 0.1 hPa. This work aims to make a comparative analysis between the ERA-Interim data and the data observed in the Solarimetric Atlas of the State of Rio Grande do Sul, to verify its applicability in the absence of an observed data network. The results obtained for the study region are analyzed to assess reanalysis data as an alternative for estimating the energy potential of a given region.

Keywords: energy potential, reanalyses, renewable energy, solar radiation

Procedia PDF Downloads 167

42373 Analysis of ECGs Survey Data by Applying Clustering Algorithm

Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif

Abstract:

Heart disease has afflicted the Indo-Pak region for many decades. Many surveys show that the percentage of cardiac patients in Pakistan is increasing day by day, and this issue needs special attention. A framework is proposed for performing detailed analysis of ECG survey data collected to measure the prevalence of heart disease in Pakistan. The ECG survey data are evaluated and filtered using automated Minnesota codes, and only those ECGs that fulfill the standardized conditions mentioned in the Minnesota codes are used for further analysis. Then feature selection is performed by applying the proposed algorithm, based on a discernibility matrix, to select relevant features from the database. Clustering is performed to expose natural clusters in the ECG survey data by applying a spectral clustering algorithm using the fuzzy c-means algorithm. The hidden patterns and interesting relationships exposed by this analysis are useful for further detailed analysis and for many other purposes.

Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix

Procedia PDF Downloads 355

42372 An Exploratory Research of Human Character Analysis Based on Smart Watch Data: Distinguish the Drinking State from Normal State

Authors: Lu Zhao, Yanrong Kang, Lili Guo, Yuan Long, Guidong Xing

Abstract:

Smart watches, as handy devices with rich functionality, have become one of the most popular types of wearable device all over the world. Among their various functions, the most basic is health monitoring. The monitoring data can serve as effective evidence or as a clue in the investigation of criminal cases. For instance, step counting data can help to determine whether the watch wearer was still or moving during a given time period. There is, however, still very little research on the analysis of human character based on these data. The purpose of this research is to analyze health monitoring data to distinguish the drinking state from the normal state. The analysis result may play a role in cases involving drinking, such as drunk driving. The experiment mainly focused on identifying which smart watch health monitoring metrics change with drinking and on quantifying the extent of the change. The chosen subjects were mostly in their 20s, and each wore the same smart watch for a week. Each subject drank several times during the week and noted the start and end time of each drinking session. The researchers then extracted and analyzed the health monitoring data from the watch. According to the descriptive statistical analysis, the heart rate changes when drinking. The average heart rate is about 10% higher than normal, and the coefficient of variation is less than about 30% of that of the normal state. Though more research is needed, this experiment and analysis suggest one way of applying the data from smart watches.
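The two descriptive statistics named in the abstract above, the relative change in mean heart rate and the coefficient of variation, can be illustrated with a minimal Python sketch. The readings below are invented purely for illustration and are not data from the study; the function names are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Return the population standard deviation divided by the mean."""
    m = mean(samples)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return pstdev(samples) / m


def relative_change(baseline, value):
    """Return the change of value relative to baseline (0.1 means +10%)."""
    return (value - baseline) / baseline


# Hypothetical heart-rate readings in beats per minute (not study data).
normal_bpm = [68, 72, 70, 75, 65, 70]
drinking_bpm = [76, 78, 77, 79, 75, 77]

change = relative_change(mean(normal_bpm), mean(drinking_bpm))
cv_normal = coefficient_of_variation(normal_bpm)
cv_drinking = coefficient_of_variation(drinking_bpm)

print(f"mean heart rate change: {change:+.1%}")
print(f"CV normal: {cv_normal:.4f}, CV drinking: {cv_drinking:.4f}")
```

Comparing the two coefficients of variation, rather than the raw standard deviations, keeps the spread comparison meaningful when the mean heart rate itself shifts between states.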

Keywords: character analysis, descriptive statistics analysis, drink state, heart rate, smart watch

Procedia PDF Downloads 169

42371 To Handle Data-Driven Software Development Projects Effectively

Authors: Shahnewaz Khan

Abstract:

Machine learning (ML) techniques are often used in projects for creating data-driven applications. These tasks typically demand additional research and analysis. The proper technique and strategy must be chosen to ensure the success of data-driven projects. Otherwise, even with a lot of effort, the necessary development might not always be possible. This work examines the workflow of data-driven software development projects and its implementation process in order to describe how to manage such a project successfully, which will help minimize the added workload.

Keywords: data, data-driven projects, data science, NLP, software project

Procedia PDF Downloads 88

42370 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years, but most of the work is focused on data in the English language. Comprehensive research and analysis that considers multiple languages, machine translation techniques, and different classifiers is essential. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two groups: the first classifies text without language translation, and the second translates the testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or whether building language-specific sentiment classification systems is a better approach. The effects of language translation techniques and features, and the accuracy of various classifiers for multilingual sentiment analysis, are also discussed in this study.

Keywords: cross-language analysis, machine learning, machine translation, sentiment analysis

Procedia PDF Downloads 718

42369 Impact of Map Generalization in Spatial Analysis

Authors: Lin Li, P. G. R. N. I. Pussella

Abstract:

When spatial data and their attributes are represented on different types of maps, scale plays a key role in the process of map generalization. The process consists of two main operators, selection and omission. Once data have been selected, they undergo several geometric transformations such as elimination, simplification, smoothing, exaggeration, displacement, aggregation and size reduction. As a result of these operations at different levels of data, the geometry of spatial features, such as length, sinuosity, orientation, perimeter and area, is altered. The effect is worst when preparing small-scale maps, since the cartographer does not have enough space to represent all the features on the map. When GIS users want to analyze a set of spatial data, they typically retrieve a data set and carry out the analysis without considering very important characteristics such as the scale, the purpose of the map and the degree of generalization. Further, GIS users use and compare different maps with different degrees of generalization. Sometimes, GIS users go beyond the scale of the source map using the zoom-in facility and violate the basic cartographic rule that 'it is not suitable to create a larger scale map using a smaller scale map'. The main objective of this study is to discuss the effect of map generalization on GIS analysis. Three digital maps at scales of 1:10000, 1:50000 and 1:250000 were used, all prepared by the Survey Department of Sri Lanka, the National Mapping Agency of Sri Lanka. Features common to all three maps were used, and an overlay analysis was performed repeatedly with different combinations of the data. Road, river and land use data sets were used for the study. A simple model for finding the best place for a wildlife park was used to identify the effects.
The results show remarkable effects of the different degrees of generalization. Different locations with different geometries were obtained as outputs of the analysis. The study suggests that reasonable methods are needed to overcome this effect. As a solution, it is recommended to bring all the data sets to a common scale before carrying out the analysis.

Keywords: generalization, GIS, scales, spatial analysis

Procedia PDF Downloads 332

42368 Protection of Cultural Heritage against the Effects of Climate Change Using Autonomous Aerial Systems Combined with Automated Decision Support

Authors: Artur Krukowski, Emmanouela Vogiatzaki

Abstract:

The article presents ongoing work in research projects such as SCAN4RECO and ARCH, both funded by the European Commission under the Horizon 2020 program. The former concerns multimodal and multispectral scanning of Cultural Heritage assets for their digitization and conservation via spatiotemporal reconstruction and 3D printing, while the latter aims to better preserve areas of cultural heritage from hazards and risks. It co-creates tools that would help pilot cities to save cultural heritage from the effects of climate change. It develops a disaster risk management framework for assessing and improving the resilience of historic areas to climate change and natural hazards. Tools and methodologies are designed for local authorities and practitioners, the urban population, and national and international expert communities, aiding authorities in knowledge-aware decision making. In this article we focus on 3D modelling of object geometry using primarily photogrammetric methods to achieve very high model accuracy with consumer-grade devices, which are attractive to professionals and hobbyists alike.

Keywords: 3D modelling, UAS, cultural heritage, preservation

Procedia PDF Downloads 127

42367 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform where millions of users share their attitudes, views, and opinions daily. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in Twitter data is an effective and computationally efficient way to analyze a large set of tweets. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining by extracting and analyzing useful information from unstructured text using two approaches, LDA topic modelling and sentiment analysis, applied to Twitter plain text data in English. These two methods allow people to mine data more effectively and efficiently. The LDA topic model and sentiment analysis can also be applied to provide insights in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 182
