Search results for: data clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24698

23918 Data Presentation of Lane-Changing Events Trajectories Using HighD Dataset

Authors: Basma Khelfa, Antoine Tordeux, Ibrahima Ba

Abstract:

We present a descriptive analysis of lane-changing events on multi-lane roads. The data come from the Highway Drone Dataset (HighD), which provides microscopic vehicle trajectories on highways. This paper describes and analyses the role of the different parameters and their significance. Using the HighD data, we aim to find the most frequent reasons that motivate drivers to change lanes. We used the programming language R to process these data. We analyze the involvement and relationship of the different variables of each parameter of the ego vehicle and the four vehicles surrounding it, i.e., distance, speed difference, time gap, and acceleration. This was studied according to the class of the vehicle (car or truck) and according to the maneuver it undertook (overtaking or falling back).

Keywords: autonomous driving, physical traffic model, prediction model, statistical learning process

Procedia PDF Downloads 243
23917 Evaluation of Golden Beam Data for the Commissioning of 6 and 18 MV Photons Beams in Varian Linear Accelerator

Authors: Shoukat Ali, Abdul Qadir Jandga, Amjad Hussain

Abstract:

Objective: The main purpose of this study is to compare the percent depth dose (PDD) and the in-plane and cross-plane profiles of Varian Golden Beam Data with the measured data for 6 and 18 MV photons for the commissioning of the Eclipse treatment planning system. Introduction: Commissioning of a treatment planning system requires extensive acquisition of beam data for the clinical use of linear accelerators. Accurate dose delivery requires entering the PDDs, profiles, and dose rate tables for open and wedge fields into the treatment planning system, enabling it to calculate the MUs and dose distribution. Varian offers a generic set of beam data as reference data; however, it is not recommended for clinical use. In this study, we compared the generic beam data with the measured beam data to evaluate the reliability of the generic beam data for clinical purposes. Methods and Material: PDDs and profiles of open and wedge fields for different field sizes and at different depths were measured as per Varian’s algorithm commissioning guideline. The measurements were performed with a PTW 3D-scanning water phantom with a semi-flex ion chamber and MEPHYSTO software. The Varian Golden Beam Data available online were compared with the measured data to evaluate the accuracy of the golden beam data for the commissioning of the Eclipse treatment planning system. Results: The deviation between measured and golden beam data was at most 2%. In PDDs, the deviation increases more at deeper depths than at shallower depths. Similarly, profiles show the same trend of increasing deviation at large field sizes and increasing depths. Conclusion: The study shows that the percentage deviation between measured and golden beam data is within the acceptable tolerance and that the golden beam data can therefore be used for the commissioning process; however, verification of a small subset of acquired data against the golden beam data should be mandatory before clinical use.

Keywords: percent depth dose, flatness, symmetry, golden beam data

Procedia PDF Downloads 470
23916 Variable-Fidelity Surrogate Modelling with Kriging

Authors: Selvakumar Ulaganathan, Ivo Couckuyt, Francesco Ferranti, Tom Dhaene, Eric Laermans

Abstract:

Variable-fidelity surrogate modelling offers an efficient way to approximate function data available in multiple degrees of accuracy each with varying computational cost. In this paper, a Kriging-based variable-fidelity surrogate modelling approach is introduced to approximate such deterministic data. Initially, individual Kriging surrogate models, which are enhanced with gradient data of different degrees of accuracy, are constructed. Then these Gradient enhanced Kriging surrogate models are strategically coupled using a recursive CoKriging formulation to provide an accurate surrogate model for the highest fidelity data. While, intuitively, gradient data is useful to enhance the accuracy of surrogate models, the primary motivation behind this work is to investigate if it is also worthwhile incorporating gradient data of varying degrees of accuracy.

Keywords: Kriging, CoKriging, Surrogate modelling, Variable-fidelity modelling, Gradients

Procedia PDF Downloads 540
23915 Robust Barcode Detection with Synthetic-to-Real Data Augmentation

Authors: Xiaoyan Dai, Hsieh Yisan

Abstract:

Barcode processing of captured images is a huge challenge, as different shooting conditions can result in different barcode appearances. This paper proposes a deep learning-based barcode detection using synthetic-to-real data augmentation. We first augment barcodes themselves; we then augment images containing the barcodes to generate a large variety of data that is close to the actual shooting environments. Comparisons with previous works and evaluations with our original data show that this approach achieves state-of-the-art performance in various real images. In addition, the system uses hybrid resolution for barcode “scan” and is applicable to real-time applications.

Keywords: barcode detection, data augmentation, deep learning, image-based processing

Procedia PDF Downloads 143
23914 Forming Form, Motivation and Their Biolinguistic Hypothesis: The Case of Consonant Iconicity in Tashelhiyt Amazigh and English

Authors: Noury Bakrim

Abstract:

When dealing with motivation/arbitrariness, forming form (Forma Formans) and morphodynamics are to be grasped as relevant implications of enunciation/enactment, schematization within the specificity of language as sound/meaning articulation. Thus, the fact that a language is a form does not contradict stasis/dynamic enunciation (reflexivity vs double articulation). Moreover, some languages exemplify the role of the forming form, uttering, and schematization (roots in Semitic languages, the Chinese case). Beyond the evolutionary biosemiotic process (form/substance bifurcation, the split between realization/representation), non-isomorphism/asymmetry between linguistic form/norm and linguistic realization (phonetics for instance) opens up a new horizon problematizing the role of Brain – sensorimotor contribution in the continuous forming form. Therefore, we hypothesize biotization as both process/trace co-constructing motivation/forming form. Henceforth, referring to our findings concerning distribution and motivation patterns within Berber written texts (pulse based obstruents and nasal-lateral levels in poetry) and oral storytelling (consonant intensity clustering in quantitative and semantic/prosodic motivation), we understand consonant clustering, motivation and schematization as a complex phenomenon partaking in patterns of oral/written iconic prosody and reflexive metalinguistic representation opening the stable form. We focus our inquiry on both Amazigh and English clusters (/spl/, /spr/) and iconic consonant iteration in [gnunnuy] (to roll/tumble), [smummuy] (to moan sadly or crankily). For instance, the syllabic structures of /splaeʃ/ and /splaet/ imply an anamorphic representation of the state of the world: splash, impact on aquatic surfaces/splat impact on the ground. 
The pair has stridency and distribution as distinctive features which specify its phonetic realization (and a part of its meaning): /ʃ/ is [+ strident] and /t/ is [+ distributed] on the vocal tract. Schematization is then a process relating both physiology/code as an arthron vocal/bodily, vocal/practical shaping of the motor-articulatory system, leading to syntactic/semantic thematization (agent/patient roles in /spl/, /sm/ and other clusters or the tense uvular /qq/ at the initial position in Berber). Furthermore, the productivity of serial syllable sequencing in Berber points out different expressivity forms. We postulate two components of motivated formalization: i) the process of memory paradigmatization relating to sequence modeling under sensorimotor/verbal specific categories (production/perception), ii) the process of phonotactic selection - prosodic unconscious/subconscious distribution by virtue of iconicity. Based on multiple tests including a questionnaire, phonotactic/visual recognition and oral/written reproduction, we aim at patterning/conceptualizing consonant schematization and motivation among EFL and Amazigh (Berber) learners and speakers integrating biolinguistic hypotheses.

Keywords: consonant motivation and prosody, language and order of life, anamorphic representation, represented representation, biotization, sensori-motor and brain representation, form, formalization and schematization

Procedia PDF Downloads 131
23913 Spatial Analysis and Determinants of Number of Antenatal Health Care Visit Among Pregnant Women in Ethiopia: Application of Spatial Multilevel Count Regression Models

Authors: Muluwerk Ayele Derebe

Abstract:

Background: Antenatal care (ANC) is an essential element in the continuum of reproductive health care for preventing preventable pregnancy-related morbidity and mortality. Objective: The aim of this study is to assess the spatial pattern and predictors of ANC visits in Ethiopia. Method: This study used 2016 Ethiopian Demographic and Health Survey data, a nationwide community-based cross-sectional survey, for 7,174 pregnant women aged 15-49 years. Spatial analysis was done using Getis-Ord Gi* statistics to identify hot and cold spot areas of ANC visits. The multilevel glmmTMB package, adjusted for spatial effects, was used in R. Spatial multilevel count regression was conducted to identify predictors of antenatal care visits for pregnant women, and proportional change in variance was used to uncover the effect of individual and community-level factors on ANC visits. Results: The distribution of ANC visits was spatially clustered (Moran’s I = 0.271, p<0.001; ICC = 0.497, p<0.001). The highest spatial outlier areas of ANC visits were found in Amhara (South Wollo, West Gojjam, North Shewa), Oromo (West Arsi and East Harariga), Tigray (Central Tigray) and Benishangul-Gumuz (Asosa and Metekel) regions. The data had excess zeros (34.6%) and were over-dispersed. The expected number of ANC visits for pregnant women with pregnancy complications was higher (coefficient 0.7868) [ARR = 2.1964, 95% CI: 1.8605, 2.5928, p-value <0.0001] than for pregnant women who had no pregnancy complications. The expected number of ANC visits for a pregnant woman who lived in a rural area was higher (coefficient 1.2254) [ARR = 3.4057, 95% CI: 2.1462, 5.4041, p-value <0.0001] than for a pregnant woman who lived in an urban area. The study found dissimilar clusters with a low number of zero counts for a mean number of ANC visits surrounded by clusters with a higher number of counts of an average number of ANC visits when other variables were held constant.
Conclusion: This study found that the number of ANC visits in Ethiopia had a spatial pattern associated with socioeconomic, demographic, and geographic risk factors. Spatial clustering of ANC visits exists in all regions of Ethiopia. The predictor age of the mother, religion, mother’s education, husband’s education, mother's occupation, husband's occupation, signs of pregnancy complication, wealth index and marital status had a strong association with the number of ANC visits by each individual. At the community level, place of residence, region, age of the mother, sex of the household head, signs of pregnancy complications and distance to health facility factors had a strong association with the number of ANC visits.
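
The two figures reported alongside the adjusted rate ratios (0.7868 and 1.2254) appear to be coefficients on the log scale, since count regression models of the Poisson family usually use a log link in which the rate ratio equals the exponential of the coefficient. The short check below is only a sketch under that assumption; the function name `rate_ratio` is illustrative and not taken from the study.

```python
import math


def rate_ratio(coefficient: float) -> float:
    """Convert a log-link regression coefficient into a rate ratio."""
    return math.exp(coefficient)


# Values quoted in the abstract; the log-link reading is an assumption.
complication_arr = rate_ratio(0.7868)  # abstract reports ARR = 2.1964
rural_arr = rate_ratio(1.2254)         # abstract reports ARR = 3.4057

print(round(complication_arr, 3), round(rural_arr, 3))
```

Under this reading, both reported ARR values are reproduced to about three decimal places.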

Keywords: Ethiopia, ANC, spatial, multilevel, zero inflated Poisson

Procedia PDF Downloads 60
23912 Analysis of Delivery of Quad Play Services

Authors: Rahul Malhotra, Anurag Sharma

Abstract:

Fiber-based access networks can deliver performance that supports the increasing demand for high-speed connections. One of the new technologies that have emerged in recent years is Passive Optical Networks. This paper aims to show the simultaneous delivery of triple play service (data, voice, and video). A comparative investigation of various data rates and their suitability is presented. It is demonstrated that as the data rate increases, the number of users that can be accommodated decreases because of the increase in bit error rate.

Keywords: FTTH, quad play, play service, access networks, data rate

Procedia PDF Downloads 392
23911 Classification of Manufacturing Data for Efficient Processing on an Edge-Cloud Network

Authors: Onyedikachi Ulelu, Andrew P. Longstaff, Simon Fletcher, Simon Parkinson

Abstract:

The widespread interest in 'Industry 4.0' or 'digital manufacturing' has led to significant research requiring the acquisition of data from sensors, instruments, and machine signals. In-depth research then identifies methods of analysis of the massive amounts of data generated before and during manufacture to solve a particular problem. The ultimate goal is for industrial Internet of Things (IIoT) data to be processed automatically to assist with either visualisation or autonomous system decision-making. However, the collection and processing of data in an industrial environment come with a cost. Little research has been undertaken on how to specify optimally what data to capture, transmit, process, and store at various levels of an edge-cloud network. The first step in this specification is to categorise IIoT data for efficient and effective use. This paper proposes the required attributes and classification to take manufacturing digital data from various sources to determine the most suitable location for data processing on the edge-cloud network. The proposed classification framework will minimise overhead in terms of network bandwidth/cost and processing time of machine tool data via efficient decision making on which dataset should be processed at the ‘edge’ and what to send to a remote server (cloud). A fast-and-frugal heuristic method is implemented for this decision-making. The framework is tested using case studies from industrial machine tools for machine productivity and maintenance.

Keywords: data classification, decision making, edge computing, industrial IoT, industry 4.0

Procedia PDF Downloads 161
23910 Denoising Transient Electromagnetic Data

Authors: Lingerew Nebere Kassie, Ping-Yu Chang, Hsin-Hua Huang, Chaw-Son Chen

Abstract:

Transient electromagnetic (TEM) data plays a crucial role in hydrogeological and environmental applications, providing valuable insights into geological structures and resistivity variations. However, the presence of noise often hinders the interpretation and reliability of these data. Our study addresses this issue by utilizing a FASTSNAP system for the TEM survey, which operates at different modes (low, medium, and high) with continuous adjustments to discretization, gain, and current. We employ a denoising approach that processes the raw data obtained from each acquisition mode to improve signal quality and enhance data reliability. We use a signal-averaging technique for each mode, increasing the signal-to-noise ratio. Additionally, we utilize wavelet transform to suppress noise further while preserving the integrity of the underlying signals. This approach significantly improves the data quality, notably suppressing severe noise at late times. The resulting denoised data exhibits a substantially improved signal-to-noise ratio, leading to increased accuracy in parameter estimation. By effectively denoising TEM data, our study contributes to a more reliable interpretation and analysis of underground structures. Moreover, the proposed denoising approach can be seamlessly integrated into existing ground-based TEM data processing workflows, facilitating the extraction of meaningful information from noisy measurements and enhancing the overall quality and reliability of the acquired data.
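
The signal-averaging step can be illustrated with a small simulation. This is a minimal sketch, not the authors' processing chain: the decaying exponential stands in for a TEM transient, the Gaussian noise level and the number of repeated records (64) are arbitrary assumptions, and the wavelet stage is not shown.

```python
import math
import random


def stack_average(records):
    """Average repeated records sample by sample (signal stacking)."""
    count = len(records)
    return [sum(samples) / count for samples in zip(*records)]


def rms_error(signal, reference):
    """Root-mean-square difference between a signal and a reference."""
    squared = sum((s - r) ** 2 for s, r in zip(signal, reference))
    return math.sqrt(squared / len(reference))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical decaying transient standing in for a TEM response.
clean = [math.exp(-t / 20.0) for t in range(200)]

# 64 repeated noisy acquisitions of the same transient (assumed noise model).
records = [[value + rng.gauss(0.0, 0.1) for value in clean] for _ in range(64)]

stacked = stack_average(records)
single_error = rms_error(records[0], clean)
stacked_error = rms_error(stacked, clean)
print(f"single record RMS error: {single_error:.4f}")
print(f"stacked RMS error:       {stacked_error:.4f}")
```

For independent zero-mean noise, averaging N records reduces the noise standard deviation by roughly a factor of sqrt(N), which is why the stacked error is much smaller than that of a single record.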

Keywords: data quality, signal averaging, transient electromagnetic, wavelet transform

Procedia PDF Downloads 73
23909 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization

Authors: Hironori Karachi, Haruka Yamashita

Abstract:

Recently, quick response (QR) code systems have been growing in popularity. Many companies are introducing new QR code payment services, and the services are competing with each other to increase their number of users. To increase the number of users, we should grasp the differences between services in the demographic information, usage information, and value of users. In this study, we analyze real-world data provided by Nomura Research Institute, including the demographic data of users and information on users’ usage of two services: LINE Pay and PayPay. Nonnegative Matrix Factorization (NMF) is widely used to analyze such data and interpret their features; however, the target data contain missing values. EM-algorithm NMF (EMNMF) completes unknown values so that the features of data given in matrix form can be understood. Moreover, for comparing the results of NMF analyses of two matrices, Discriminant NMF (DNMF) shows the difference in user features between the two matrices. In this study, we combine EMNMF and DNMF and analyze the target data. As the interpretation, we show the difference in user features between LINE Pay and PayPay.

Keywords: data science, non-negative matrix factorization, missing data, quality of services

Procedia PDF Downloads 114
23908 Developing Guidelines for Public Health Nurse Data Management and Use in Public Health Emergencies

Authors: Margaret S. Wright

Abstract:

Background/Significance: During many recent public health emergencies/disasters, public health nursing data has been missing or delayed, potentially impacting decision-making and response. Data used as evidence for decision-making in response, planning, and mitigation has been erratic and slow, decreasing the ability to respond. Methodology: Applying best practices in data management and data use in public health settings, guided by the concepts outlined in ‘Disaster Standards of Care’ models, leads to the development of recommendations for a model of best practices in data management and use in public health disasters/emergencies by public health nurses. As the ‘patient’ in public health disasters/emergencies is the community (local, regional or national), guidelines for patient documentation are incorporated in the recommendations. Findings: Using the model, public health nurses could better plan how to prepare for, respond to, and mitigate disasters in their communities, and better participate in decision-making in all three phases, bringing public health nursing data to the discussion as part of the evidence base for decision-making.

Keywords: data management, decision making, disaster planning documentation, public health nursing

Procedia PDF Downloads 203
23907 Genodata: The Human Genome Variation Using BigData

Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta

Abstract:

Since the accomplishment of the Human Genome Project, there has been an unparalleled escalation in the sequencing of genomic data. This project has been the first major vault in the field of medical research, especially in genomics. This project won accolades by using a concept called Bigdata, which was earlier used extensively to gain value for business. Bigdata makes use of data sets which are generally in the form of files of size terabytes, petabytes, or exabytes, and these data sets were traditionally used and managed using Excel sheets and RDBMS. The voluminous data made the process tedious and time consuming, and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and more efficient. This paper focuses on using SPARK, which is gaining momentum with the advancement of BigData technologies. Cloud Storage is an effective medium for storage of the large data sets which are generated from the genetic research and the resultant sets produced from SPARK analysis.

Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop

Procedia PDF Downloads 236
23906 Identification of Text Domains and Register Variation through the Analysis of Lexical Distribution in a Bangla Mass Media Text Corpus

Authors: Mahul Bhattacharyya, Niladri Sekhar Dash

Abstract:

The present research paper is an experimental attempt to investigate the nature of variation in the register in three major text domains, namely, social, cultural, and political texts collected from the corpus of Bangla printed mass media texts. This present study uses a corpus of a moderate amount of Bangla mass media text that contains nearly one million words collected from different media sources like newspapers, magazines, advertisements, periodicals, etc. The analysis of corpus data reveals that each text has certain lexical properties that not only control their identity but also mark their uniqueness across the domains. At first, the subject domains of the texts are classified by two parameters, namely 'Genre' and 'Text Type'. Next, some empirical investigations are made to understand how the domains vary from each other in terms of lexical properties such as function and content words. Here the method of comparative-cum-contrastive matching of lexical load across domains is invoked through word frequency count to track how domain-specific words and terms may be marked as decisive indicators in the act of specifying the textual contexts and subject domains. The study shows that the common lexical stock that percolates across all text domains is quite dicey in nature, as its lexicological identity does not have any bearing on the act of specifying subject domains. Therefore, it becomes necessary for language users to anchor upon certain domain-specific lexical items to recognize a text that belongs to a specific text domain. The eventual findings of this study confirm that texts belonging to different subject domains in the Bangla news text corpus clearly differ on the parameters of lexical load, lexical choice, lexical clustering, and lexical collocation.
In fact, based on these parameters, along with some statistical calculations, it is possible to classify mass media texts into different types to mark their relation with regard to the domains to which they actually belong. The advantage of this analysis lies in the proper identification of the linguistic factors which will give language users a better insight into the method they employ in text comprehension, as well as construct a systemic frame for designing text identification strategy for language learners. The availability of a huge amount of Bangla media text data is useful for achieving accurate conclusions with a certain amount of reliability and authenticity. This kind of corpus-based analysis is quite relevant for a resource-poor language like Bangla, as no attempt has ever been made to understand how the structure and texture of Bangla mass media texts vary due to certain linguistic and extra-linguistic constraints that are actively operational in specific text domains. Since mass media language is assumed to be the most 'recent representation' of the actual use of the language, this study is expected to show how the Bangla news texts reflect the thoughts of the society and how they leave a strong impact on the thought process of the speech community.
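
The idea of matching lexical load across domains through word frequency counts can be sketched roughly as follows. The three one-line English samples are hypothetical placeholders, not data from the Bangla corpus, and whitespace tokenization is a simplifying assumption; the authors' actual procedure is not described at this level of detail.

```python
from collections import Counter


def word_frequencies(text):
    """Count lower-cased, whitespace-separated tokens in a text."""
    return Counter(text.lower().split())


def domain_specific_words(domain_texts):
    """For each domain, keep only the words that occur in no other domain."""
    counts = {name: word_frequencies(text) for name, text in domain_texts.items()}
    specific = {}
    for name, freq in counts.items():
        seen_elsewhere = set()
        for other_name, other_freq in counts.items():
            if other_name != name:
                seen_elsewhere.update(other_freq)
        specific[name] = {w: c for w, c in freq.items() if w not in seen_elsewhere}
    return specific


# Hypothetical toy samples standing in for the three text domains.
samples = {
    "political": "the minister addressed the parliament",
    "cultural": "the festival opened with the music",
    "social": "the village school welcomed the families",
}

specific = domain_specific_words(samples)
for domain, words in specific.items():
    print(domain, sorted(words))
```

In this toy case the word shared by all three samples ("the") drops out, while the remaining words act as domain markers, which mirrors the distinction the abstract draws between common lexical stock and domain-specific items.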

Keywords: Bangla, corpus, discourse, domains, lexical choice, mass media, register, variation

Procedia PDF Downloads 163
23905 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons

Authors: Said Boularouk, Didier Josselin, Eitan Altman

Abstract:

In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. The platform, based on produsage, gives data producers the freedom to choose the descriptors of geocoded locations. Unfortunately, this freedom, also called folksonomy, complicates subsequent searches of the data. We try to solve this issue with a simple but usable method that extracts data from OSM databases and sends them to visually impaired people using Text To Speech technology. We focus on how to help people with a visual disability plan their itinerary and comprehend a map by querying a computer and getting information about the surrounding environment in a mono-modal human-computer dialogue.

Keywords: TTS, ontology, open street map, visually impaired

Procedia PDF Downloads 279
23904 Design and Development of a Platform for Analyzing Spatio-Temporal Data from Wireless Sensor Networks

Authors: Walid Fantazi

Abstract:

The development of sensor technology (such as microelectromechanical systems (MEMS), wireless communications, embedded systems, distributed processing and wireless sensor applications) has contributed to a broad range of WSN applications which are capable of collecting a large amount of spatiotemporal data in real time. These systems require real-time data processing to manage storage in real time and query the data they process. To cover these needs, we propose in this paper a Snapshot spatiotemporal data model based on object-oriented concepts. This model saves storage and reduces data redundancy, which makes it easier to execute spatiotemporal queries and saves analysis time. Further, to ensure the robustness of the system as well as the elimination of congestion from the main access memory, we propose a spatiotemporal indexing technique in RAM called Captree *. As a result, we offer an RIA (Rich Internet Application)-based SOA application architecture which allows remote monitoring and control.

Keywords: WSN, indexing data, SOA, RIA, geographic information system

Procedia PDF Downloads 239
23903 Prediction of Marine Ecosystem Changes Based on the Integrated Analysis of Multivariate Data Sets

Authors: Prozorkevitch D., Mishurov A., Sokolov K., Karsakov L., Pestrikova L.

Abstract:

The current body of knowledge about the marine environment and the dynamics of marine ecosystems includes a huge amount of heterogeneous data collected over decades. It generally includes a wide range of hydrological, biological and fishery data. Marine researchers collect these data and analyze how and why the ecosystem changes from past to present. Based on these historical records and linkages between the processes it is possible to predict future changes. Multivariate analysis of trends and their interconnection in the marine ecosystem may be used as an instrument for predicting further ecosystem evolution. A wide range of information about the components of the marine ecosystem for more than 50 years needs to be used to investigate how these arrays can help to predict the future.

Keywords: barents sea ecosystem, abiotic, biotic, data sets, trends, prediction

Procedia PDF Downloads 96
23902 Optical Fiber Data Throughput in a Quantum Communication System

Authors: Arash Kosari, Ali Araghi

Abstract:

A mathematical model for an optical-fiber communication channel is developed, which results in an expression that calculates the throughput and loss of the corresponding link. The data are assumed to be transmitted using separate photons with different polarizations. The derived model also shows the dependency of data throughput on the length of the channel and the depolarization factor. It is observed that absorption of photons affects the throughput more strongly than depolarization does. Apart from that, the probability of depolarization and the absorption of radiated photons are obtained.

Keywords: absorption, data throughput, depolarization, optical fiber

Procedia PDF Downloads 274
23901 Offshore Outsourcing: Global Data Privacy Controls and International Compliance Issues

Authors: Michelle J. Miller

Abstract:

In recent years, there has been a rise of two emerging issues that impact the global employment and business market and that the legal community must review more closely: offshore outsourcing and data privacy. These two issues intersect because employment opportunities are shifting due to offshore outsourcing, and in some States, like the United States, anti-outsourcing legislation has been passed or presented to retain jobs within the country. In addition, the legal requirements to retain the privacy of data as a global employer extend to employees and third-party service providers, including services outsourced to offshore locations. For this reason, this paper will review the intersection of these two issues with a specific focus on data privacy.

Keywords: outsourcing, data privacy, international compliance, multinational corporations

Procedia PDF Downloads 395
23900 Weighted Data Replication Strategy for Data Grid Considering Economic Approach

Authors: N. Mansouri, A. Asadi

Abstract:

Data Grid is a geographically distributed environment that deals with data-intensive applications in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of the Latest Access Largest Weight strategy. However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replica replacement task. ELALW replaces replicas based on the number of requests in the future, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection chooses the best replica location from among the many replicas based on response time, which can be determined by considering the data transfer time, the storage access latency, the replica requests that are waiting in the storage queue, and the distance between nodes. Simulation results using OptorSim show that our replication strategy achieves better overall performance than other strategies in terms of job execution time, effective network usage and storage resource usage.
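
Replica selection by estimated response time can be sketched as below. The abstract names the four factors (transfer time, storage access latency, queued requests, distance between nodes) but not how they are combined, so the simple additive form, the field names, and the constants `service_time_s` and `per_hop_delay_s` are all assumptions made for illustration and are not the ELALW formula.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    site: str
    bandwidth_mb_s: float     # link bandwidth towards the requesting site
    storage_latency_s: float  # storage access latency at the holding site
    queued_requests: int      # requests already waiting in the storage queue
    hops: int                 # network distance to the requesting site


def estimated_response_time(replica, file_size_mb,
                            service_time_s=0.5, per_hop_delay_s=0.01):
    """Additive estimate: transfer + storage latency + queue wait + distance."""
    transfer = file_size_mb / replica.bandwidth_mb_s
    queue_wait = replica.queued_requests * service_time_s
    distance = replica.hops * per_hop_delay_s
    return transfer + replica.storage_latency_s + queue_wait + distance


def select_best_replica(replicas, file_size_mb):
    """Pick the replica with the smallest estimated response time."""
    return min(replicas, key=lambda r: estimated_response_time(r, file_size_mb))


# Hypothetical candidate sites holding a 1000 MB file.
replicas = [
    Replica("A", 100.0, 0.2, 4, 2),
    Replica("B", 50.0, 0.1, 0, 5),
    Replica("C", 100.0, 0.1, 0, 8),
]
best = select_best_replica(replicas, file_size_mb=1000.0)
print(best.site)
```

In this example site C is selected: although it is the farthest in hops, it has a fast link and an empty queue, so its estimated response time is the lowest.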

Keywords: data grid, data replication, simulation, replica selection, replica placement

Procedia PDF Downloads 245
23899 Evaluation of Satellite and Radar Rainfall Product over Seyhan Plain

Authors: Kazım Kaba, Erdem Erdi, M. Akif Erdoğan, H. Mustafa Kandırmaz

Abstract:

Rainfall is a crucial data source for many disciplines, such as agriculture, hydrology and climate. Therefore, the rain rate should be well known both spatially and temporally for any area. Rainfall has traditionally been measured for many years using rain gauges at meteorological ground stations. At present, rainfall products are acquired from radar and satellite images with temporal and spatial continuity. In this study, we investigated the accuracy of these rainfall data against rain-gauge data. For this purpose, we used the Adana-Hatay radar hourly total precipitation product (RN1) and the Meteosat convective rainfall rate (CRR) product over Seyhan plain. We calculated daily rainfall values from the RN1 and CRR hourly precipitation products. We used the data of rainy days from four stations located within range of the radar from October 2013 to November 2015. In the study, we examined the two rainfall data sets over Seyhan plain, and the correlation between the rain-gauge data and the two raster rainfall data sets was found to be low.
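
The two processing steps mentioned above, aggregating hourly products into daily totals and correlating them with rain-gauge values, can be sketched as follows. The numbers are made-up placeholders rather than RN1, CRR or station data, and the use of the Pearson coefficient is an assumption, since the abstract does not say which correlation measure was used.

```python
import math


def daily_totals(hourly_by_day):
    """Sum the hourly precipitation values of each day into a daily total."""
    return [sum(hours) for hours in hourly_by_day]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hourly product values (mm) for three days, two hours each.
hourly_product = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
# Hypothetical rain-gauge daily totals (mm) for the same days.
gauge_daily = [0.0, 2.0, 4.0]

product_daily = daily_totals(hourly_product)
r = pearson(product_daily, gauge_daily)
print(product_daily, r)
```

The toy values are perfectly linearly related, so the coefficient here is 1; the study reports a low correlation for the real products.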

Keywords: meteosat, radar, rainfall, rain-gauge, Turkey

Procedia PDF Downloads 307
23898 Spatial Data Mining by Decision Trees

Authors: Sihem Oujdi, Hafida Belbachir

Abstract:

Existing data mining methods cannot be applied directly to spatial data, because spatial data require specific considerations such as spatial relationships. This paper focuses on classification with decision trees, one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data based on two different approaches: join materialization and querying the different tables on the fly. Similar work has been done on these two approaches: the first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships between the objects in the target table and those in the neighbor table. The proposed algorithms are applied to spatial data in the accidentology domain. A comparative study of our approach with other work on classification by spatial decision trees is presented.
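
A minimal sketch of the join materialization approach follows: the target table, neighbor table, and spatial join index are flattened into one table, on which the gain ratio that C4.5 uses to choose a split can be computed. The table contents, attribute names, and function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2

# Hypothetical input tables.
target = {1: {"severity": "high"}, 2: {"severity": "low"},
          3: {"severity": "high"}, 4: {"severity": "low"}}
neighbor = {10: {"type": "school"}, 11: {"type": "bar"}}
# Spatial join index: (target id, neighbor id, spatial relationship).
join_index = [(1, 11, "near"), (2, 10, "near"), (3, 11, "near"), (4, 10, "far")]


def materialize_join(target, neighbor, join_index):
    """Flatten the three tables into one row per (target, neighbor) pair."""
    rows = []
    for t_id, n_id, relation in join_index:
        row = dict(target[t_id])
        row["neighbor_type"] = neighbor[n_id]["type"]
        row["relation"] = relation
        rows.append(row)
    return rows


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, attribute, label):
    """Information gain divided by split information, as in C4.5."""
    base = entropy([r[label] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[label] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    split_info = entropy([r[attribute] for r in rows])
    return (base - remainder) / split_info if split_info > 0 else 0.0


rows = materialize_join(target, neighbor, join_index)
best_attribute = max(["neighbor_type", "relation"],
                     key=lambda a: gain_ratio(rows, a, "severity"))
print(best_attribute)
```

Materializing the join once keeps each split evaluation a plain scan over `rows`, which reflects the time-for-memory trade-off described above; an on-the-fly variant would instead look up `join_index` and `neighbor` inside `gain_ratio`.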

Keywords: C4.5 algorithm, decision trees, S-CART, spatial data mining

Procedia PDF Downloads 600
23897 Data-Driven Dynamic Overbooking Model for Tour Operators

Authors: Kannapha Amaruchkul

Abstract:

We formulate a dynamic overbooking model for a tour operator, in which most reservations contain at least two people. The cancellation rate and the timing of the cancellation may depend on the group size. We propose two overbooking policies, namely economic- and service-based. In an economic-based policy, we want to minimize the expected oversold and underused cost, whereas, in a service-based policy, we ensure that the probability of an oversold situation does not exceed the pre-specified threshold. To illustrate the applicability of our approach, we use 2016-2018 tour package data from a tour operator in Thailand to build a data-driven robust optimization model, and we tested the proposed overbooking policy in 2019. We also compare the data-driven approach to the conventional approach of fitting a probability distribution to the data.
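
The service-based policy can be illustrated with a deliberately simplified static sketch. It assumes that each booked seat shows up independently with the same probability, which ignores the group sizes and cancellation timing that the model above accounts for; the function names and parameters are hypothetical.

```python
from math import comb


def oversold_probability(bookings, capacity, show_prob):
    """P(number of shows > capacity) under independent, identical show-ups."""
    return sum(
        comb(bookings, k) * show_prob**k * (1 - show_prob) ** (bookings - k)
        for k in range(capacity + 1, bookings + 1)
    )


def max_bookings(capacity, show_prob, threshold):
    """Largest booking limit whose oversold probability stays within threshold."""
    if not 0 < show_prob <= 1 or not 0 <= threshold < 1:
        raise ValueError("need 0 < show_prob <= 1 and 0 <= threshold < 1")
    limit = capacity
    # The oversold probability grows with the limit, so stop at the first violation.
    while oversold_probability(limit + 1, capacity, show_prob) <= threshold:
        limit += 1
    return limit


limit = max_bookings(capacity=10, show_prob=0.7, threshold=0.05)
print(limit, oversold_probability(limit, 10, 0.7))
```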

Keywords: applied stochastic model, data-driven robust optimization, overbooking, revenue management, tour operator

Procedia PDF Downloads 116
23896 Modeling and Statistical Analysis of a Soap Production Mix in Bejoy Manufacturing Industry, Anambra State, Nigeria

Authors: Okolie Chukwulozie Paul, Iwenofu Chinwe Onyedika, Sinebe Jude Ebieladoh, M. C. Nwosu

Abstract:

This research is based on a statistical analysis of processing data. The aim is to analyze the data statistically and to generate a design model for the production mix of soap products at the Bejoy manufacturing company in Nkpologwu, Aguata Local Government Area, Anambra State, Nigeria. The statistical analysis shows the correlation of the data. A t-test, partial correlation, and bivariate correlation were used to understand what the data portray. The design model developed was used to model the production yield, and the correlation of the variables shows an R-squared of 98.7%. The results confirm that the data are fit for further analysis and modeling, as demonstrated by the correlation and the R-squared.

Keywords: General Linear Model, correlation, variables, Pearson, significance, T-test, soap, production mix, statistics

Procedia PDF Downloads 427
23895 Helping the Development of Public Policies with Knowledge of Criminal Data

Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno

Abstract:

The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as associative algorithms and decision tree rule extraction, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea, with a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.

Keywords: social data analysis, criminal records, computational techniques, data mining, big data

Procedia PDF Downloads 65
23894 Enhancing the Bionic Eye: A Real-time Image Optimization Framework to Encode Color and Spatial Information Into Retinal Prostheses

Authors: William Huang

Abstract:

Retinal prostheses are currently limited to low-resolution grayscale images that lack color and spatial information. This study develops a novel real-time image optimization framework and tools to encode maximum information to the prostheses, which are constrained by the number of electrodes. One key idea is to localize the main objects in images while reducing unnecessary background noise through region-contrast saliency maps. A novel color depth mapping technique was developed through MiniBatchKmeans clustering and color space selection. The resulting image was downsampled using bicubic interpolation to reduce image size while preserving color quality. In comparison to current schemes, the proposed framework demonstrated better visual quality in tested images. The use of the region-contrast saliency map showed improvements in efficacy of up to 30%. Finally, the computation time of this algorithm is under 380 ms on tested cases, making real-time retinal prostheses feasible.
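
The color quantization step can be illustrated with a small pure-Python mini-batch k-means. This is a sketch of the general technique only, not the authors' pipeline: it works directly on RGB tuples, omits the saliency map, color space selection, and bicubic downsampling, and all names and pixel values are hypothetical.

```python
import random


def nearest(color, centers):
    """Index of the center closest to color (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((c - m) ** 2 for c, m in zip(color, centers[i])))


def mini_batch_kmeans(pixels, k, batch_size=8, iterations=50, seed=0):
    """Mini-batch k-means with a per-center learning rate of 1 / count."""
    rng = random.Random(seed)
    centers = [[float(v) for v in p] for p in rng.sample(pixels, k)]
    counts = [0] * k
    for _ in range(iterations):
        batch = [rng.choice(pixels) for _ in range(batch_size)]
        # Assign the whole batch first, then move the centers.
        assignments = [nearest(p, centers) for p in batch]
        for p, idx in zip(batch, assignments):
            counts[idx] += 1
            rate = 1.0 / counts[idx]
            centers[idx] = [(1 - rate) * m + rate * c
                            for m, c in zip(centers[idx], p)]
    return centers


def quantize(pixels, centers):
    """Replace every pixel by the rounded color of its nearest center."""
    palette = [tuple(round(v) for v in c) for c in centers]
    return [palette[nearest(p, centers)] for p in pixels]


# Made-up image: a few reddish and bluish RGB pixels.
pixels = [(250, 10, 10), (240, 20, 15), (245, 5, 20),
          (10, 10, 250), (20, 15, 240), (5, 25, 245)]
centers = mini_batch_kmeans(pixels, k=2)
quantized = quantize(pixels, centers)
print(sorted(set(quantized)))
```

Updating centers from small random batches rather than from all pixels per iteration is what keeps the cost per step low, which matters for the real-time constraint mentioned above.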

Keywords: retinal implants, virtual processing unit, computer vision, saliency maps, color quantization

Procedia PDF Downloads 131
23893 Optimization of Real Time Measured Data Transmission, Given the Amount of Data Transmitted

Authors: Michal Kopcek, Tomas Skulavik, Michal Kebisek, Gabriela Krizanova

Abstract:

The operation of nuclear power plants involves continuous monitoring of the environment in their area. This monitoring is performed using a complex data acquisition system, which collects status information about the system itself and the values of many important physical variables, e.g., temperature, humidity, and dose rate. This paper describes a proposal for, and optimization of, the communication that takes place in a teledosimetric system between the central control server, which is responsible for data processing and storage, and the decentralized measuring stations, which measure the physical variables. Analyses of the ongoing communication were performed, and the system architecture and communication were subsequently optimized.

Keywords: communication protocol, transmission optimization, data acquisition, system architecture

Procedia PDF Downloads 500
23892 A Fuzzy Approach to Liver Tumor Segmentation with Zernike Moments

Authors: Abder-Rahman Ali, Antoine Vacavant, Manuel Grand-Brochier, Adélaïde Albouy-Kissi, Jean-Yves Boire

Abstract:

In this paper, we present a new segmentation approach for liver lesions in regions of interest within MRI (Magnetic Resonance Imaging). This approach, based on a two-cluster Fuzzy C-Means methodology, considers the parameter variable compactness to handle uncertainty. Fine boundaries are detected by a local recursive merging of ambiguous pixels with a sequential forward floating selection with Zernike moments. The method has been tested on both synthetic and real images. When applied to synthetic images, the proposed approach performs well: the segmentations obtained are accurate, their shape is consistent with the ground truth, and the extracted information is reliable. The results obtained on MR images confirm these observations. Even for difficult MR images, our approach extracts a segmentation with good accuracy and shape, which implies that the geometry of the tumor is preserved for further clinical activities (such as automatic extraction of pharmacokinetic properties, lesion characterization, etc.).
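
The two-cluster Fuzzy C-Means step can be sketched with the standard algorithm on one-dimensional intensities. This sketch does not include the variable compactness parameter, the recursive merging of ambiguous pixels, or the Zernike moments described above; the function name, the fuzzifier value m = 2, and the sample intensities are assumptions.

```python
def fuzzy_c_means(values, n_clusters=2, m=2.0, iterations=100, tol=1e-6):
    """Standard FCM on scalars; returns (centers, membership rows)."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly across the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / n_clusters for i in range(n_clusters)]
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if any(d == 0 for d in dists):
                # A point sitting on a center belongs fully to that center.
                row = [1.0 if d == 0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            memberships.append([u / total for u in row])
        new_centers = []
        for j in range(n_clusters):
            weights = [row[j] ** m for row in memberships]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # Memberships were computed from the centers before the final (tiny) update.
    return centers, memberships


# Made-up pixel intensities: dark background and a bright lesion.
intensities = [10, 12, 11, 200, 205, 198]
centers, memberships = fuzzy_c_means(intensities)
# Defuzzification: assign each pixel to its highest-membership cluster.
labels = [row.index(max(row)) for row in memberships]
print(centers, labels)
```

Pixels whose two memberships are close to each other are the ambiguous ones that a refinement stage, such as the merging step described above, would revisit.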

Keywords: defuzzification, floating search, fuzzy clustering, Zernike moments

Procedia PDF Downloads 440
23891 Network Impact of a Social Innovation Initiative in Rural Areas of Southern Italy

Authors: A. M. Andriano, M. Lombardi, A. Lopolito, M. Prosperi, A. Stasi, E. Iannuzzi

Abstract:

In accordance with the scientific debate on the definition of Social Innovation (SI), the present paper identifies SI as new ideas (products, services, and models) that simultaneously meet social needs and create new social relationships or collaborations. This concept offers important tools for unraveling the difficult condition of the agricultural sector in marginalized areas, which is characterized by the abandonment of activities, a low level of farmer education, and low generational renewal, all of which hamper new territorial strategies aimed at integrated and sustainable development. Models of SI in agriculture, starting from a bottom-up approach or from the community, are considered to represent the driving force of an ecological and digital revolution. A system based on SI may be able to grasp and satisfy individual and social needs and to promote new forms of entrepreneurship. In this context, Vazapp ('Go Hoeing') is an emerging SI model in southern Italy that promotes solutions for satisfying the needs of farmers and facilitates their relationships (creation of a network). The Vazapp initiative considered in this study is the Contadinners ('Farmer’s dinners'), a dinner held at a farmer’s house where stakeholders living in the surrounding area get to know each other and are able to build a network for possible future professional collaborations. The aim of the paper is to identify the evolution of farmers’ relationships, both quantitatively and qualitatively, as a result of the Contadinners initiative organized by Vazapp. To this end, the study adopts the Social Network Analysis (SNA) methodology, using UCINET (Version 6.667) software to analyze the relational structure. Data were collected through a questionnaire distributed to 387 participants in the twenty 'Contadinners' held from February 2016 to June 2018. The response rate to the survey was about 50% of farmers.
The data analysis focused on different aspects: a) the measurement of relational reciprocity among the farmers using the symmetrize method on the answers; b) the measurement of answer reliability using the dichotomize method; c) the description of the evolution of social capital using the cohesion method; d) the clustering of the Contadinners' participants into followers and non-followers of Vazapp to evaluate its impact on the local social capital. The results concern the effectiveness of this initiative in generating trustworthy relationships within the rural area of southern Italy, which is typically affected by individualism and mistrust. The number of relationships is the quantitative indicator that defines the dimension of the network development, while the typologies of relationships (from simple friendship to formal collaborations and brand-new cooperation initiatives) are the qualitative indicator that offers a diversified perspective on the network impact. From the analysis carried out, Vazapp’s initiative clearly represents a virtuous SI model for catalyzing relationships within rural areas and developing entrepreneurship based on the real needs of the community.
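
The matrix operations named in points a) to c) can be sketched in plain Python. These are generic analogues written for illustration, not UCINET's own routines; the dyad-based definition of reciprocity, the use of density as a cohesion measure, and the sample tie strengths are assumptions.

```python
def dichotomize(matrix, threshold=0):
    """Turn valued ties into 0/1: a tie counts only if it exceeds threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in matrix]


def symmetrize(matrix, rule=max):
    """Make ties undirected; rule=max keeps any tie, rule=min only mutual ones."""
    n = len(matrix)
    return [[0 if i == j else rule(matrix[i][j], matrix[j][i])
             for j in range(n)] for i in range(n)]


def reciprocity(binary):
    """Dyad-based reciprocity: mutual dyads / dyads with at least one tie."""
    n = len(binary)
    mutual = connected = 0
    for i in range(n):
        for j in range(i + 1, n):
            if binary[i][j] or binary[j][i]:
                connected += 1
                if binary[i][j] and binary[j][i]:
                    mutual += 1
    return mutual / connected if connected else 0.0


def density(binary):
    """Share of possible directed ties that are present (a cohesion measure)."""
    n = len(binary)
    ties = sum(binary[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))


# Made-up questionnaire answers: row i rates the strength of its tie to column j.
answers = [[0, 3, 0],
           [2, 0, 1],
           [0, 0, 0]]
binary = dichotomize(answers)
print(reciprocity(binary), density(binary), symmetrize(binary, rule=min))
```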

Keywords:

Procedia PDF Downloads 97
23890 The Duty of Application and Connection Providers Regarding the Supply of Internet Protocol by Court Order in Brazil to Determine Authorship of Acts Practiced on the Internet

Authors: João Pedro Albino, Ana Cláudia Pires Ferreira de Lima

Abstract:

Humanity has undergone a transformation from the physical to the virtual world, generating an enormous amount of data on the world wide web, known as big data. Many facts that occur in the physical world or in the digital world are proven through records made on the internet, such as digital photographs, posts on social media, contract acceptances by digital platforms, email, banking, and messaging applications, among others. These data recorded on the internet have been used as evidence in judicial proceedings. The identification of internet users is essential for the security of legal relationships. This research was carried out on scientific articles and materials from courses and lectures, with an analysis of Brazilian legislation and some judicial decisions on the request of static data from logs and Internet Protocols (IPs) from application and connection providers. In this article, we will address the determination of authorship of data processing on the internet by obtaining the IP address and the appropriate judicial procedure for this purpose under Brazilian law.

Keywords: IP address, digital forensics, big data, data analytics, information and communication technology

Procedia PDF Downloads 107
23889 Sourcing and Compiling a Maltese Traffic Dataset MalTra

Authors: Gabriele Borg, Alexei De Bono, Charlie Abela

Abstract:

There is a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is experiencing ongoing population growth and an increase in mobility demand. This research takes advantage of data that is continuously being sourced and converts it into useful information related to the traffic problem on Maltese roads. The scope of this paper is to provide a methodology for creating a custom dataset (MalTra - Malta Traffic), compiled from multiple participants in various locations across the island, to identify the most common routes taken and expose the main areas of activity. Such use of big data appears in various technologies referred to as Intelligent Transportation Systems (ITSs), and it is concluded that there is significant potential in utilising such sources of data on a nationwide scale.

Keywords: Big Data, vehicular traffic, traffic management, mobile data patterns

Procedia PDF Downloads 93