Search results for: jackknife resampling

2 Terrestrial Laser Scans to Assess Aerial LiDAR Data

Authors: J. F. Reinoso-Gordo, F. J. Ariza-López, A. Mozas-Calvache, J. L. García-Balboa, S. Eddargani

Abstract:

The DEMs quality may depend on several factors such as data source, capture method, processing type used to derive them, or the cell size of the DEM. The two most important capture methods to produce regional-sized DEMs are photogrammetry and LiDAR; DEMs covering entire countries have been obtained with these methods. The quality of these DEMs has traditionally been evaluated by the national cartographic agencies through punctual sampling that focused on its vertical component. For this type of evaluation there are standards such as NMAS and ASPRS Positional Accuracy Standards for Digital Geospatial Data. However, it seems more appropriate to carry out this evaluation by means of a method that takes into account the superficial nature of the DEM and, therefore, its sampling is superficial and not punctual. This work is part of the Research Project "Functional Quality of Digital Elevation Models in Engineering" where it is necessary to control the quality of a DEM whose data source is an experimental LiDAR flight with a density of 14 points per square meter to which we call Point Cloud Product (PCpro). In the present work it is described the capture data on the ground and the postprocessing tasks until getting the point cloud that will be used as reference (PCref) to evaluate the PCpro quality. Each PCref consists of a patch 50x50 m size coming from a registration of 4 different scan stations. The area studied was the Spanish region of Navarra that covers an area of 10,391 km2; 30 patches homogeneously distributed were necessary to sample the entire surface. The patches have been captured using a Leica BLK360 terrestrial laser scanner mounted on a pole that reached heights of up to 7 meters; the position of the scanner was inverted so that the characteristic shadow circle does not exist when the scanner is in direct position. To ensure that the accuracy of the PCref is greater than that of the PCpro, the georeferencing of the PCref has been carried out with real-time GNSS, and its accuracy positioning was better than 4 cm; this accuracy is much better than the altimetric mean square error estimated for the PCpro (<15 cm); The kind of DEM of interest is the corresponding to the bare earth, so that it was necessary to apply a filter to eliminate vegetation and auxiliary elements such as poles, tripods, etc. After the postprocessing tasks the PCref is ready to be compared with the PCpro using different techniques: cloud to cloud or after a resampling process DEM to DEM.

Keywords: data quality, DEM, LiDAR, terrestrial laser scanner, accuracy

Procedia PDF Downloads 100

1 Predictive Analytics for Theory Building

Authors: Ho-Won Jung, Donghun Lee, Hyung-Jin Kim

Abstract:

Predictive analytics (data analysis) uses a subset of measurements (the features, predictor, or independent variable) to predict another measurement (the outcome, target, or dependent variable) on a single person or unit. It applies empirical methods in statistics, operations research, and machine learning to predict the future, or otherwise unknown events or outcome on a single or person or unit, based on patterns in data. Most analyses of metabolic syndrome are not predictive analytics but statistical explanatory studies that build a proposed model (theory building) and then validate metabolic syndrome predictors hypothesized (theory testing). A proposed theoretical model forms with causal hypotheses that specify how and why certain empirical phenomena occur. Predictive analytics and explanatory modeling have their own territories in analysis. However, predictive analytics can perform vital roles in explanatory studies, i.e., scientific activities such as theory building, theory testing, and relevance assessment. In the context, this study is to demonstrate how to use our predictive analytics to support theory building (i.e., hypothesis generation). For the purpose, this study utilized a big data predictive analytics platform TM based on a co-occurrence graph. The co-occurrence graph is depicted with nodes (e.g., items in a basket) and arcs (direct connections between two nodes), where items in a basket are fully connected. A cluster is a collection of fully connected items, where the specific group of items has co-occurred in several rows in a data set. Clusters can be ranked using importance metrics, such as node size (number of items), frequency, surprise (observed frequency vs. expected), among others. The size of a graph can be represented by the numbers of nodes and arcs. Since the size of a co-occurrence graph does not depend directly on the number of observations (transactions), huge amounts of transactions can be represented and processed efficiently. For a demonstration, a total of 13,254 metabolic syndrome training data is plugged into the analytics platform to generate rules (potential hypotheses). Each observation includes 31 predictors, for example, associated with sociodemographic, habits, and activities. Some are intentionally included to get predictive analytics insights on variable selection such as cancer examination, house type, and vaccination. The platform automatically generates plausible hypotheses (rules) without statistical modeling. Then the rules are validated with an external testing dataset including 4,090 observations. Results as a kind of inductive reasoning show potential hypotheses extracted as a set of association rules. Most statistical models generate just one estimated equation. On the other hand, a set of rules (many estimated equations from a statistical perspective) in this study may imply heterogeneity in a population (i.e., different subpopulations with unique features are aggregated). Next step of theory development, i.e., theory testing, statistically tests whether a proposed theoretical model is a plausible explanation of a phenomenon interested in. If hypotheses generated are tested statistically with several thousand observations, most of the variables will become significant as the p-values approach zero. Thus, theory validation needs statistical methods utilizing a part of observations such as bootstrap resampling with an appropriate sample size.

Keywords: explanatory modeling, metabolic syndrome, predictive analytics, theory building

Procedia PDF Downloads 276