Search results for: Umberto Nanni
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3

3 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of entity relation discovery in unstructured data, a well-covered topic in the literature, consists of searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases, machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists of collecting the cooccurrences of any two words (entities) in order to create a graph of relations, called a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because related words are more likely to appear close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the cooccurrences within a sliding window running over the whole text. In this paper we generalise this technique, arriving at a Weighted-Distance Sliding Window, where each cooccurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting of the text of the Bible, split into verses.
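
The idea in the abstract above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `weighted_cooccurrences`, the default window size, and the inverse-distance weight `1/d` are all assumptions, since the abstract does not state the actual weighting function. It also counts each pair of occurrences once when they are fewer than `window` tokens apart, which is one possible reading of "within the window".

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, entities, window=5, weight=lambda d: 1.0 / d):
    """Build a weighted cooccurrence graph as a dict {(u, v): score}.

    tokens   -- list of words in reading order
    entities -- set of words treated as named items (hypothetical input)
    window   -- two occurrences are paired only if their distance < window
    weight   -- maps a token distance d >= 1 to a score (assumed: 1/d)
    """
    graph = defaultdict(float)
    # Keep only the positions that hold an entity of interest.
    positions = [(i, t) for i, t in enumerate(tokens) if t in entities]
    for a in range(len(positions)):
        i, u = positions[a]
        for b in range(a + 1, len(positions)):
            j, v = positions[b]
            d = j - i
            if d >= window:
                break  # positions are sorted, so later ones are farther away
            if u == v:
                continue  # skip self-loops
            edge = tuple(sorted((u, v)))  # undirected edge, canonical order
            graph[edge] += weight(d)
    return dict(graph)


tokens = "moses spoke to aaron and aaron answered moses".split()
graph = weighted_cooccurrences(tokens, {"moses", "aaron"}, window=4)
uniform = weighted_cooccurrences(
    tokens, {"moses", "aaron"}, window=4, weight=lambda d: 1.0
)
```

Passing a constant weight, as in `uniform`, reduces the sketch to a plain unweighted count of pairs within the window. For a text split into verses, one option would be to call the function on each verse and sum the resulting edge scores.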

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Downloads: 627
2 A Case Study of the Digital Translation of the Lucy Lloyd and Wilhelm Bleek |Xam and !Kun Notebooks into The Digital Bleek and Lloyd

Authors: F. Saptouw

Abstract:

This paper will examine the digitization process of the |Xam and !Kun notebooks, authored by Lucy Lloyd, Dorothea Bleek and Wilhelm Bleek, and their collaborators |a!kunta, ||kabbo, ≠kasin, Dia!kwain, !kweiten ta ||ken, |han≠kass'o, !nanni, Tamme, |uma, and Da during the 19th century. Detail will be provided about the status of the archive, the creation of the digital archive and selected research projects linked to the archive. The Digital Bleek and Lloyd project is an example of institutional collaboration by the University of Cape Town, University of South Africa, Iziko South African Museum, the National Library of South Africa and the Western Cape Provincial Archives and Records Service. The contemporary value of the archive will be discussed in relation to its current manifestation as a collection of archival and digital objects, each with its own set of properties and archival risk factors. This tension between the two ways to access the archive will be interrogated to shed light on the slippages between the digital object and the archival object. The primary argument is that the process of digitization generates an ontological shift in the status of the archival object. The secondary argument is an engagement with practices to curate the encounters with these ontologically shifted objects and how to relate to each as a contemporary viewer. In conclusion this paper will argue for regarding these archival objects according to the interpretive framework utilized to engage secular relics.

Keywords: Archive, curatorship, digitization, The Digital Bleek and Lloyd.

Downloads: 514
1 Heterogeneous-Resolution and Multi-Source Terrain Builder for CesiumJS WebGL Virtual Globe

Authors: Umberto Di Staso, Marco Soave, Alessio Giori, Federico Prandi, Raffaele De Amicis

Abstract:

The increasing availability of information about the elevation of the Earth's surface (Digital Elevation Models, DEM) generated from different sources (remote sensing, aerial images, Lidar) raises the question of how to integrate this huge amount of data and make it available to the widest possible audience. In order to exploit the potential of 3D elevation representation, the quality of data management plays a fundamental role. Due to the high acquisition costs and the huge amount of generated data, high-resolution terrain surveys tend to be small or medium sized and available only for limited portions of the Earth. Hence the need to merge large-scale height maps, which are typically made available for free at worldwide level, with very specific high-resolution datasets. On the other hand, the third dimension improves the user experience and the data representation quality, unlocking new possibilities in data analysis for civil protection, real estate, urban planning, environment monitoring, etc. Open-source 3D virtual globes, which are a trending topic in Geovisual Analytics, aim at improving the visualization of geographical data provided by standard web services or in proprietary formats. Typically, 3D virtual globes do not offer an open-source tool that allows the generation of a terrain elevation data structure starting from heterogeneous-resolution terrain datasets. This paper describes a technological solution aimed at setting up a so-called “Terrain Builder”. This tool is able to merge heterogeneous-resolution datasets and to provide multi-resolution worldwide terrain services that are fully compatible with CesiumJS and therefore accessible via the web using a traditional browser without any additional plug-in.

Keywords: Terrain builder, WebGL, virtual globe, CesiumJS, tiled map service, TMS, height-map, regular grid, Geovisual analytics, DTM.

Downloads: 2340