Search results for: data clustering
23918 Data Presentation of Lane-Changing Events Trajectories Using HighD Dataset
Authors: Basma Khelfa, Antoine Tordeux, Ibrahima Ba
Abstract:
We present a descriptive analysis data of lane-changing events in multi-lane roads. The data are provided from The Highway Drone Dataset (HighD), which are microscopic trajectories in highway. This paper describes and analyses the role of the different parameters and their significance. Thanks to HighD data, we aim to find the most frequent reasons that motivate drivers to change lanes. We used the programming language R for the processing of these data. We analyze the involvement and relationship of different variables of each parameter of the ego vehicle and the four vehicles surrounding it, i.e., distance, speed difference, time gap, and acceleration. This was studied according to the class of the vehicle (car or truck), and according to the maneuver it undertook (overtaking or falling back).Keywords: autonomous driving, physical traffic model, prediction model, statistical learning process
Procedia PDF Downloads 24323917 Evaluation of Golden Beam Data for the Commissioning of 6 and 18 MV Photons Beams in Varian Linear Accelerator
Authors: Shoukat Ali, Abdul Qadir Jandga, Amjad Hussain
Abstract:
Objective: The main purpose of this study is to compare the Percent Depth dose (PDD) and In-plane and cross-plane profiles of Varian Golden beam data to the measured data of 6 and 18 MV photons for the commissioning of Eclipse treatment planning system. Introduction: Commissioning of treatment planning system requires an extensive acquisition of beam data for the clinical use of linear accelerators. Accurate dose delivery require to enter the PDDs, Profiles and dose rate tables for open and wedges fields into treatment planning system, enabling to calculate the MUs and dose distribution. Varian offers a generic set of beam data as a reference data, however not recommend for clinical use. In this study, we compared the generic beam data with the measured beam data to evaluate the reliability of generic beam data to be used for the clinical purpose. Methods and Material: PDDs and Profiles of Open and Wedge fields for different field sizes and at different depths measured as per Varian’s algorithm commissioning guideline. The measurement performed with PTW 3D-scanning water phantom with semi-flex ion chamber and MEPHYSTO software. The online available Varian Golden Beam Data compared with the measured data to evaluate the accuracy of the golden beam data to be used for the commissioning of Eclipse treatment planning system. Results: The deviation between measured vs. golden beam data was in the range of 2% max. In PDDs, the deviation increases more in the deeper depths than the shallower depths. Similarly, profiles have the same trend of increasing deviation at large field sizes and increasing depths. Conclusion: Study shows that the percentage deviation between measured and golden beam data is within the acceptable tolerance and therefore can be used for the commissioning process; however, verification of small subset of acquired data with the golden beam data should be mandatory before clinical use.Keywords: percent depth dose, flatness, symmetry, golden beam data
Procedia PDF Downloads 47023916 Variable-Fidelity Surrogate Modelling with Kriging
Authors: Selvakumar Ulaganathan, Ivo Couckuyt, Francesco Ferranti, Tom Dhaene, Eric Laermans
Abstract:
Variable-fidelity surrogate modelling offers an efficient way to approximate function data available in multiple degrees of accuracy each with varying computational cost. In this paper, a Kriging-based variable-fidelity surrogate modelling approach is introduced to approximate such deterministic data. Initially, individual Kriging surrogate models, which are enhanced with gradient data of different degrees of accuracy, are constructed. Then these Gradient enhanced Kriging surrogate models are strategically coupled using a recursive CoKriging formulation to provide an accurate surrogate model for the highest fidelity data. While, intuitively, gradient data is useful to enhance the accuracy of surrogate models, the primary motivation behind this work is to investigate if it is also worthwhile incorporating gradient data of varying degrees of accuracy.Keywords: Kriging, CoKriging, Surrogate modelling, Variable- fidelity modelling, Gradients
Procedia PDF Downloads 54023915 Robust Barcode Detection with Synthetic-to-Real Data Augmentation
Authors: Xiaoyan Dai, Hsieh Yisan
Abstract:
Barcode processing of captured images is a huge challenge, as different shooting conditions can result in different barcode appearances. This paper proposes a deep learning-based barcode detection using synthetic-to-real data augmentation. We first augment barcodes themselves; we then augment images containing the barcodes to generate a large variety of data that is close to the actual shooting environments. Comparisons with previous works and evaluations with our original data show that this approach achieves state-of-the-art performance in various real images. In addition, the system uses hybrid resolution for barcode “scan” and is applicable to real-time applications.Keywords: barcode detection, data augmentation, deep learning, image-based processing
Procedia PDF Downloads 14323914 Forming Form, Motivation and Their Biolinguistic Hypothesis: The Case of Consonant Iconicity in Tashelhiyt Amazigh and English
Authors: Noury Bakrim
Abstract:
When dealing with motivation/arbitrariness, forming form (Forma Formans) and morphodynamics are to be grasped as relevant implications of enunciation/enactment, schematization within the specificity of language as sound/meaning articulation. Thus, the fact that a language is a form does not contradict stasis/dynamic enunciation (reflexivity vs double articulation). Moreover, some languages exemplify the role of the forming form, uttering, and schematization (roots in Semitic languages, the Chinese case). Beyond the evolutionary biosemiotic process (form/substance bifurcation, the split between realization/representation), non-isomorphism/asymmetry between linguistic form/norm and linguistic realization (phonetics for instance) opens up a new horizon problematizing the role of Brain – sensorimotor contribution in the continuous forming form. Therefore, we hypothesize biotization as both process/trace co-constructing motivation/forming form. Henceforth, referring to our findings concerning distribution and motivation patterns within Berber written texts (pulse based obstruents and nasal-lateral levels in poetry) and oral storytelling (consonant intensity clustering in quantitative and semantic/prosodic motivation), we understand consonant clustering, motivation and schematization as a complex phenomenon partaking in patterns of oral/written iconic prosody and reflexive metalinguistic representation opening the stable form. We focus our inquiry on both Amazigh and English clusters (/spl/, /spr/) and iconic consonant iteration in [gnunnuy] (to roll/tumble), [smummuy] (to moan sadly or crankily). For instance, the syllabic structures of /splaeʃ/ and /splaet/ imply an anamorphic representation of the state of the world: splash, impact on aquatic surfaces/splat impact on the ground. The pair has stridency and distribution as distinctive features which specify its phonetic realization (and a part of its meaning) /ʃ/ is [+ strident] and /t/ is [+ distributed] on the vocal tract. Schematization is then a process relating both physiology/code as an arthron vocal/bodily, vocal/practical shaping of the motor-articulatory system, leading to syntactic/semantic thematization (agent/patient roles in /spl/, /sm/ and other clusters or the tense uvular /qq/ at the initial position in Berber). Furthermore, the productivity of serial syllable sequencing in Berber points out different expressivity forms. We postulate two Components of motivated formalization: i) the process of memory paradigmatization relating to sequence modeling under sensorimotor/verbal specific categories (production/perception), ii) the process of phonotactic selection - prosodic unconscious/subconscious distribution by virtue of iconicity. Basing on multiple tests including a questionnaire, phonotactic/visual recognition and oral/written reproduction, we aim at patterning/conceptualizing consonant schematization and motivation among EFL and Amazigh (Berber) learners and speakers integrating biolinguistic hypotheses.Keywords: consonant motivation and prosody, language and order of life, anamorphic representation, represented representation, biotization, sensori-motor and brain representation, form, formalization and schematization
Procedia PDF Downloads 13123913 Spatial Analysis and Determinants of Number of Antenatal Health Care Visit Among Pregnant Women in Ethiopia: Application of Spatial Multilevel Count Regression Models
Authors: Muluwerk Ayele Derebe
Abstract:
Background: Antenatal care (ANC) is an essential element in the continuum of reproductive health care for preventing preventable pregnancy-related morbidity and mortality. Objective: The aim of this study is to assess the spatial pattern and predictors of ANC visits in Ethiopia. Method: This study was done using Ethiopian Demographic and Health Survey data of 2016 among 7,174 pregnant women aged 15-49 years which was a nationwide community-based cross-sectional survey. Spatial analysis was done using Getis-Ord Gi* statistics to identify hot and cold spot areas of ANC visits. Multilevel glmmTMB packages adjusted for spatial effects were used in R software. Spatial multilevel count regression was conducted to identify predictors of antenatal care visits for pregnant women, and proportional change in variance was done to uncover the effect of individual and community-level factors of ANC visits. Results: The distribution of ANC visits was spatially clustered Moran’s I = 0.271, p<.0.001, ICC = 0.497, p<0.001). The highest spatial outlier areas of ANC visit was found in Amhara (South Wollo, Weast Gojjam, North Shewa), Oromo (west Arsi and East Harariga), Tigray (Central Tigray) and Benishangul-Gumuz (Asosa and Metekel) regions. The data was found with excess zeros (34.6%) and over-dispersed. The expected ANC visit of pregnant women with pregnancy complications was higher at 0.7868 [ARR= 2.1964, 95% CI: 1.8605, 2.5928, p-value <0.0001] compared to pregnant women who had no pregnancy complications. The expected ANC visit of a pregnant woman who lived in a rural area was 1.2254 times higher [ARR=3.4057, 95% CI: 2.1462, 5.4041, p-value <0.0001] as compared to a pregnant woman who lived in an urban. The study found dissimilar clusters with a low number of zero counts for a mean number of ANC visits surrounded by clusters with a higher number of counts of an average number of ANC visits when other variables held constant. Conclusion: This study found that the number of ANC visits in Ethiopia had a spatial pattern associated with socioeconomic, demographic, and geographic risk factors. Spatial clustering of ANC visits exists in all regions of Ethiopia. The predictor age of the mother, religion, mother’s education, husband’s education, mother's occupation, husband's occupation, signs of pregnancy complication, wealth index and marital status had a strong association with the number of ANC visits by each individual. At the community level, place of residence, region, age of the mother, sex of the household head, signs of pregnancy complications and distance to health facility factors had a strong association with the number of ANC visits.Keywords: Ethiopia, ANC, spatial, multilevel, zero inflated Poisson
Procedia PDF Downloads 6023912 Analysis of Delivery of Quad Play Services
Authors: Rahul Malhotra, Anurag Sharma
Abstract:
Fiber based access networks can deliver performance that can support the increasing demands for high speed connections. One of the new technologies that have emerged in recent years is Passive Optical Networks. This paper is targeted to show the simultaneous delivery of triple play service (data, voice, and video). The comparative investigation and suitability of various data rates is presented. It is demonstrated that as we increase the data rate, number of users to be accommodated decreases due to increase in bit error rate.Keywords: FTTH, quad play, play service, access networks, data rate
Procedia PDF Downloads 39223911 Classification of Manufacturing Data for Efficient Processing on an Edge-Cloud Network
Authors: Onyedikachi Ulelu, Andrew P. Longstaff, Simon Fletcher, Simon Parkinson
Abstract:
The widespread interest in 'Industry 4.0' or 'digital manufacturing' has led to significant research requiring the acquisition of data from sensors, instruments, and machine signals. In-depth research then identifies methods of analysis of the massive amounts of data generated before and during manufacture to solve a particular problem. The ultimate goal is for industrial Internet of Things (IIoT) data to be processed automatically to assist with either visualisation or autonomous system decision-making. However, the collection and processing of data in an industrial environment come with a cost. Little research has been undertaken on how to specify optimally what data to capture, transmit, process, and store at various levels of an edge-cloud network. The first step in this specification is to categorise IIoT data for efficient and effective use. This paper proposes the required attributes and classification to take manufacturing digital data from various sources to determine the most suitable location for data processing on the edge-cloud network. The proposed classification framework will minimise overhead in terms of network bandwidth/cost and processing time of machine tool data via efficient decision making on which dataset should be processed at the ‘edge’ and what to send to a remote server (cloud). A fast-and-frugal heuristic method is implemented for this decision-making. The framework is tested using case studies from industrial machine tools for machine productivity and maintenance.Keywords: data classification, decision making, edge computing, industrial IoT, industry 4.0
Procedia PDF Downloads 16123910 Denoising Transient Electromagnetic Data
Authors: Lingerew Nebere Kassie, Ping-Yu Chang, Hsin-Hua Huang, , Chaw-Son Chen
Abstract:
Transient electromagnetic (TEM) data plays a crucial role in hydrogeological and environmental applications, providing valuable insights into geological structures and resistivity variations. However, the presence of noise often hinders the interpretation and reliability of these data. Our study addresses this issue by utilizing a FASTSNAP system for the TEM survey, which operates at different modes (low, medium, and high) with continuous adjustments to discretization, gain, and current. We employ a denoising approach that processes the raw data obtained from each acquisition mode to improve signal quality and enhance data reliability. We use a signal-averaging technique for each mode, increasing the signal-to-noise ratio. Additionally, we utilize wavelet transform to suppress noise further while preserving the integrity of the underlying signals. This approach significantly improves the data quality, notably suppressing severe noise at late times. The resulting denoised data exhibits a substantially improved signal-to-noise ratio, leading to increased accuracy in parameter estimation. By effectively denoising TEM data, our study contributes to a more reliable interpretation and analysis of underground structures. Moreover, the proposed denoising approach can be seamlessly integrated into existing ground-based TEM data processing workflows, facilitating the extraction of meaningful information from noisy measurements and enhancing the overall quality and reliability of the acquired data.Keywords: data quality, signal averaging, transient electromagnetic, wavelet transform
Procedia PDF Downloads 7323909 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization
Authors: Hironori Karachi, Haruka Yamashita
Abstract:
Recently, the system of quick response (QR) code is getting popular. Many companies introduce new QR code payment services and the services are competing with each other to increase the number of users. For increasing the number of users, we should grasp the difference of feature of the demographic information, usage information, and value of users between services. In this study, we conduct an analysis of real-world data provided by Nomura Research Institute including the demographic data of users and information of users’ usages of two services; LINE Pay, and PayPay. For analyzing such data and interpret the feature of them, Nonnegative Matrix Factorization (NMF) is widely used; however, in case of the target data, there is a problem of the missing data. EM-algorithm NMF (EMNMF) to complete unknown values for understanding the feature of the given data presented by matrix shape. Moreover, for comparing the result of the NMF analysis of two matrices, there is Discriminant NMF (DNMF) shows the difference of users features between two matrices. In this study, we combine EMNMF and DNMF and also analyze the target data. As the interpretation, we show the difference of the features of users between LINE Pay and Paypay.Keywords: data science, non-negative matrix factorization, missing data, quality of services
Procedia PDF Downloads 11423908 Developing Guidelines for Public Health Nurse Data Management and Use in Public Health Emergencies
Authors: Margaret S. Wright
Abstract:
Background/Significance: During many recent public health emergencies/disasters, public health nursing data has been missing or delayed, potentially impacting the decision-making and response. Data used as evidence for decision-making in response, planning, and mitigation has been erratic and slow, decreasing the ability to respond. Methodology: Applying best practices in data management and data use in public health settings, and guided by the concepts outlined in ‘Disaster Standards of Care’ models leads to the development of recommendations for a model of best practices in data management and use in public health disasters/emergencies by public health nurses. As the ‘patient’ in public health disasters/emergencies is the community (local, regional or national), guidelines for patient documentation are incorporated in the recommendations. Findings: Using model public health nurses could better plan how to prepare for, respond to, and mitigate disasters in their communities, and better participate in decision-making in all three phases bringing public health nursing data to the discussion as part of the evidence base for decision-making.Keywords: data management, decision making, disaster planning documentation, public health nursing
Procedia PDF Downloads 20323907 Genodata: The Human Genome Variation Using BigData
Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta
Abstract:
Since the accomplishment of the Human Genome Project, there has been an unparalled escalation in the sequencing of genomic data. This project has been the first major vault in the field of medical research, especially in genomics. This project won accolades by using a concept called Bigdata which was earlier, extensively used to gain value for business. Bigdata makes use of data sets which are generally in the form of files of size terabytes, petabytes, or exabytes and these data sets were traditionally used and managed using excel sheets and RDBMS. The voluminous data made the process tedious and time consuming and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and efficient. This paper focuses on using SPARK which is gaining momentum with the advancement of BigData technologies. Cloud Storage is an effective medium for storage of large data sets which is generated from the genetic research and the resultant sets produced from SPARK analysis.Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop
Procedia PDF Downloads 23623906 Identification of Text Domains and Register Variation through the Analysis of Lexical Distribution in a Bangla Mass Media Text Corpus
Authors: Mahul Bhattacharyya, Niladri Sekhar Dash
Abstract:
The present research paper is an experimental attempt to investigate the nature of variation in the register in three major text domains, namely, social, cultural, and political texts collected from the corpus of Bangla printed mass media texts. This present study uses a corpus of a moderate amount of Bangla mass media text that contains nearly one million words collected from different media sources like newspapers, magazines, advertisements, periodicals, etc. The analysis of corpus data reveals that each text has certain lexical properties that not only control their identity but also mark their uniqueness across the domains. At first, the subject domains of the texts are classified into two parameters namely, ‘Genre' and 'Text Type'. Next, some empirical investigations are made to understand how the domains vary from each other in terms of lexical properties like both function and content words. Here the method of comparative-cum-contrastive matching of lexical load across domains is invoked through word frequency count to track how domain-specific words and terms may be marked as decisive indicators in the act of specifying the textual contexts and subject domains. The study shows that the common lexical stock that percolates across all text domains are quite dicey in nature as their lexicological identity does not have any bearing in the act of specifying subject domains. Therefore, it becomes necessary for language users to anchor upon certain domain-specific lexical items to recognize a text that belongs to a specific text domain. The eventual findings of this study confirm that texts belonging to different subject domains in Bangla news text corpus clearly differ on the parameters of lexical load, lexical choice, lexical clustering, lexical collocation. In fact, based on these parameters, along with some statistical calculations, it is possible to classify mass media texts into different types to mark their relation with regard to the domains they should actually belong. The advantage of this analysis lies in the proper identification of the linguistic factors which will give language users a better insight into the method they employ in text comprehension, as well as construct a systemic frame for designing text identification strategy for language learners. The availability of huge amount of Bangla media text data is useful for achieving accurate conclusions with a certain amount of reliability and authenticity. This kind of corpus-based analysis is quite relevant for a resource-poor language like Bangla, as no attempt has ever been made to understand how the structure and texture of Bangla mass media texts vary due to certain linguistic and extra-linguistic constraints that are actively operational to specific text domains. Since mass media language is assumed to be the most 'recent representation' of the actual use of the language, this study is expected to show how the Bangla news texts reflect the thoughts of the society and how they leave a strong impact on the thought process of the speech community.Keywords: Bangla, corpus, discourse, domains, lexical choice, mass media, register, variation
Procedia PDF Downloads 16323905 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons
Authors: Said Boularouk, Didier Josselin, Eitan Altman
Abstract:
In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. Indeed, the platform based on produsage gives a freedom to data producers to choose the descriptors of geocoded locations. Unfortunately, this freedom, called also folksonomy leads to complicate subsequent searches of data. We try to solve this issue in a simple but usable method to extract data from OSM databases in order to send them to visually impaired people using Text To Speech technology. We focus on how to help people suffering from visual disability to plan their itinerary, to comprehend a map by querying computer and getting information about surrounding environment in a mono-modal human-computer dialogue.Keywords: TTS, ontology, open street map, visually impaired
Procedia PDF Downloads 27923904 Design and Development of a Platform for Analyzing Spatio-Temporal Data from Wireless Sensor Networks
Authors: Walid Fantazi
Abstract:
The development of sensor technology (such as microelectromechanical systems (MEMS), wireless communications, embedded systems, distributed processing and wireless sensor applications) has contributed to a broad range of WSN applications which are capable of collecting a large amount of spatiotemporal data in real time. These systems require real-time data processing to manage storage in real time and query the data they process. In order to cover these needs, we propose in this paper a Snapshot spatiotemporal data model based on object-oriented concepts. This model allows saving storing and reducing data redundancy which makes it easier to execute spatiotemporal queries and save analyzes time. Further, to ensure the robustness of the system as well as the elimination of congestion from the main access memory we propose a spatiotemporal indexing technique in RAM called Captree *. As a result, we offer an RIA (Rich Internet Application) -based SOA application architecture which allows the remote monitoring and control.Keywords: WSN, indexing data, SOA, RIA, geographic information system
Procedia PDF Downloads 23923903 Prediction of Marine Ecosystem Changes Based on the Integrated Analysis of Multivariate Data Sets
Authors: Prozorkevitch D., Mishurov A., Sokolov K., Karsakov L., Pestrikova L.
Abstract:
The current body of knowledge about the marine environment and the dynamics of marine ecosystems includes a huge amount of heterogeneous data collected over decades. It generally includes a wide range of hydrological, biological and fishery data. Marine researchers collect these data and analyze how and why the ecosystem changes from past to present. Based on these historical records and linkages between the processes it is possible to predict future changes. Multivariate analysis of trends and their interconnection in the marine ecosystem may be used as an instrument for predicting further ecosystem evolution. A wide range of information about the components of the marine ecosystem for more than 50 years needs to be used to investigate how these arrays can help to predict the future.Keywords: barents sea ecosystem, abiotic, biotic, data sets, trends, prediction
Procedia PDF Downloads 9623902 Optical Fiber Data Throughput in a Quantum Communication System
Authors: Arash Kosari, Ali Araghi
Abstract:
A mathematical model for an optical-fiber communication channel is developed which results in an expression that calculates the throughput and loss of the corresponding link. The data are assumed to be transmitted by using of separate photons with different polarizations. The derived model also shows the dependency of data throughput with length of the channel and depolarization factor. It is observed that absorption of photons affects the throughput in a more intensive way in comparison with that of depolarization. Apart from that, the probability of depolarization and the absorption of radiated photons are obtained.Keywords: absorption, data throughput, depolarization, optical fiber
Procedia PDF Downloads 27423901 Offshore Outsourcing: Global Data Privacy Controls and International Compliance Issues
Authors: Michelle J. Miller
Abstract:
In recent year, there has been a rise of two emerging issues that impact the global employment and business market that the legal community must review closer: offshore outsourcing and data privacy. These two issues intersect because employment opportunities are shifting due to offshore outsourcing and some States, like the United States, anti-outsourcing legislation has been passed or presented to retain jobs within the country. In addition, the legal requirements to retain the privacy of data as a global employer extends to employees and third party service provides, including services outsourced to offshore locations. For this reason, this paper will review the intersection of these two issues with a specific focus on data privacy.Keywords: outsourcing, data privacy, international compliance, multinational corporations
Procedia PDF Downloads 39523900 Weighted Data Replication Strategy for Data Grid Considering Economic Approach
Authors: N. Mansouri, A. Asadi
Abstract:
Data Grid is a geographically distributed environment that deals with data intensive application in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy, called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of Latest Access Largest Weight strategy. However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replication replacement task. ELALW replaces replicas based on the number of requests in future, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection selects the best replica location from among the many replicas based on response time that can be determined by considering the data transfer time, the storage access latency, the replica requests that waiting in the storage queue and the distance between nodes. Simulation results utilizing the OptorSim show our replication strategy achieve better performance overall than other strategies in terms of job execution time, effective network usage and storage resource usage.Keywords: data grid, data replication, simulation, replica selection, replica placement
Procedia PDF Downloads 24523899 Evaluation of Satellite and Radar Rainfall Product over Seyhan Plain
Authors: Kazım Kaba, Erdem Erdi, M. Akif Erdoğan, H. Mustafa Kandırmaz
Abstract:
Rainfall is crucial data source for very different discipline such as agriculture, hydrology and climate. Therefore rain rate should be known well both spatial and temporal for any area. Rainfall is measured by using rain-gauge at meteorological ground stations traditionally for many years. At the present time, rainfall products are acquired from radar and satellite images with a temporal and spatial continuity. In this study, we investigated the accuracy of these rainfall data according to rain-gauge data. For this purpose, we used Adana-Hatay radar hourly total precipitation product (RN1) and Meteosat convective rainfall rate (CRR) product over Seyhan plain. We calculated daily rainfall values from RN1 and CRR hourly precipitation products. We used the data of rainy days of four stations located within range of the radar from October 2013 to November 2015. In the study, we examined two rainfall data over Seyhan plain and the correlation between the rain-gauge data and two raster rainfall data was observed lowly.Keywords: meteosat, radar, rainfall, rain-gauge, Turkey
Procedia PDF Downloads 30723898 Spatial Data Mining by Decision Trees
Authors: Sihem Oujdi, Hafida Belbachir
Abstract:
Existing methods of data mining cannot be applied on spatial data because they require spatial specificity consideration, as spatial relationships. This paper focuses on the classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches Join materialization and Querying on the fly the different tables. Similar works have been done on these two main approaches, the first - Join materialization - favors the processing time in spite of memory space, whereas the second - Querying on the fly different tables- promotes memory space despite of the processing time. The modified C4.5 algorithm requires three entries tables: a target table, a neighbor table, and a spatial index join that contains the possible spatial relationship among the objects in the target table and those in the neighbor table. Thus, the proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works of classification by spatial decision trees will be detailed.Keywords: C4.5 algorithm, decision trees, S-CART, spatial data mining
Procedia PDF Downloads 60023897 Data-Driven Dynamic Overbooking Model for Tour Operators
Authors: Kannapha Amaruchkul
Abstract:
We formulate a dynamic overbooking model for a tour operator, in which most reservations contain at least two people. The cancellation rate and the timing of the cancellation may depend on the group size. We propose two overbooking policies, namely economic- and service-based. In an economic-based policy, we want to minimize the expected oversold and underused cost, whereas, in a service-based policy, we ensure that the probability of an oversold situation does not exceed the pre-specified threshold. To illustrate the applicability of our approach, we use tour package data in 2016-2018 from a tour operator in Thailand to build a data-driven robust optimization model, and we tested the proposed overbooking policy in 2019. We also compare the data-driven approach to the conventional approach of fitting data into a probability distribution.Keywords: applied stochastic model, data-driven robust optimization, overbooking, revenue management, tour operator
Procedia PDF Downloads 11623896 Modeling and Statistical Analysis of a Soap Production Mix in Bejoy Manufacturing Industry, Anambra State, Nigeria
Authors: Okolie Chukwulozie Paul, Iwenofu Chinwe Onyedika, Sinebe Jude Ebieladoh, M. C. Nwosu
Abstract:
The research work is based on the statistical analysis of the processing data. The essence is to analyze the data statistically and to generate a design model for the production mix of soap manufacturing products in Bejoy manufacturing company Nkpologwu, Aguata Local Government Area, Anambra state, Nigeria. The statistical analysis shows the statistical analysis and the correlation of the data. T test, Partial correlation and bi-variate correlation were used to understand what the data portrays. The design model developed was used to model the data production yield and the correlation of the variables show that the R2 is 98.7%. However, the results confirm that the data is fit for further analysis and modeling. This was proved by the correlation and the R-squared.Keywords: General Linear Model, correlation, variables, pearson, significance, T-test, soap, production mix and statistic
Procedia PDF Downloads 42723895 Helping the Development of Public Policies with Knowledge of Criminal Data
Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno
Abstract:
The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as associative algorithms and extraction of tree decision rules, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea, with a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.Keywords: social data analysis, criminal records, computational techniques, data mining, big data
Procedia PDF Downloads 6523894 Enhancing the Bionic Eye: A Real-time Image Optimization Framework to Encode Color and Spatial Information Into Retinal Prostheses
Authors: William Huang
Abstract:
Retinal prostheses are currently limited to low resolution grayscale images that lack color and spatial information. This study develops a novel real-time image optimization framework and tools to encode maximum information to the prostheses which are constrained by the number of electrodes. One key idea is to localize main objects in images while reducing unnecessary background noise through region-contrast saliency maps. A novel color depth mapping technique was developed through MiniBatchKmeans clustering and color space selection. The resulting image was downsampled using bicubic interpolation to reduce image size while preserving color quality. In comparison to current schemes, the proposed framework demonstrated better visual quality in tested images. The use of the region-contrast saliency map showed improvements in efficacy up to 30%. Finally, the computational speed of this algorithm is less than 380 ms on tested cases, making real-time retinal prostheses feasible.Keywords: retinal implants, virtual processing unit, computer vision, saliency maps, color quantization
Procedia PDF Downloads 13123893 Optimization of Real Time Measured Data Transmission, Given the Amount of Data Transmitted
Authors: Michal Kopcek, Tomas Skulavik, Michal Kebisek, Gabriela Krizanova
Abstract:
The operation of nuclear power plants involves continuous monitoring of the environment in their area. This monitoring is performed using a complex data acquisition system, which collects status information about the system itself and values of many important physical variables e.g. temperature, humidity, dose rate etc. This paper describes a proposal and optimization of communication that takes place in teledosimetric system between the central control server responsible for the data processing and storing and the decentralized measuring stations, which are measuring the physical variables. Analyzes of ongoing communication were performed and consequently the optimization of the system architecture and communication was done.Keywords: communication protocol, transmission optimization, data acquisition, system architecture
Procedia PDF Downloads 50023892 A Fuzzy Approach to Liver Tumor Segmentation with Zernike Moments
Authors: Abder-Rahman Ali, Antoine Vacavant, Manuel Grand-Brochier, Adélaïde Albouy-Kissi, Jean-Yves Boire
Abstract:
In this paper, we present a new segmentation approach for liver lesions in regions of interest within MRI (Magnetic Resonance Imaging). This approach, based on a two-cluster Fuzzy C-Means methodology, considers the parameter variable compactness to handle uncertainty. Fine boundaries are detected by a local recursive merging of ambiguous pixels with a sequential forward floating selection with Zernike moments. The method has been tested on both synthetic and real images. When applied on synthetic images, the proposed approach provides good performance, segmentations obtained are accurate, their shape is consistent with the ground truth, and the extracted information is reliable. The results obtained on MR images confirm such observations. Our approach allows, even for difficult cases of MR images, to extract a segmentation with good performance in terms of accuracy and shape, which implies that the geometry of the tumor is preserved for further clinical activities (such as automatic extraction of pharmaco-kinetics properties, lesion characterization, etc).Keywords: defuzzification, floating search, fuzzy clustering, Zernike moments
Procedia PDF Downloads 44023891 Network Impact of a Social Innovation Initiative in Rural Areas of Southern Italy
Authors: A. M. Andriano, M. Lombardi, A. Lopolito, M. Prosperi, A. Stasi, E. Iannuzzi
Abstract:
In according to the scientific debate on the definition of Social Innovation (SI), the present paper identifies SI as new ideas (products, services, and models) that simultaneously meet social needs and create new social relationships or collaborations. This concept offers important tools to unravel the difficult condition for the agricultural sector in marginalized areas, characterized by the abandonment of activities, low level of farmer education, and low generational renewal, hampering new territorial strategies addressed at and integrated and sustainable development. Models of SI in agriculture, starting from bottom up approach or from the community, are considered to represent the driving force of an ecological and digital revolution. A system based on SI may be able to grasp and satisfy individual and social needs and to promote new forms of entrepreneurship. In this context, Vazapp ('Go Hoeing') is an emerging SI model in southern Italy that promotes solutions for satisfying needs of farmers and facilitates their relationships (creation of network). The Vazapp’s initiative, considered in this study, is the Contadinners ('Farmer’s dinners'), a dinner held at farmer’s house where stakeholders living in the surrounding area know each other and are able to build a network for possible future professional collaborations. The aim of the paper is to identify the evolution of farmers’ relationships, both quantitatively and qualitatively, because of the Contadinner’s initiative organized by Vazapp. To this end, the study adopts the Social Network Analysis (SNA) methodology by using UCINET (Version 6.667) software to analyze the relational structure. Data collection was realized through a questionnaire distributed to 387 participants in the twenty 'Contadinners', held from February 2016 to June 2018. The response rate to the survey was about 50% of farmers. The elaboration data was focused on different aspects, such as: a) the measurement of relational reciprocity among the farmers using the symmetrize method of answers; b) the measurement of the answer reliability using the dichotomize method; c) the description of evolution of social capital using the cohesion method; d) the clustering of the Contadinners' participants in followers and not-followers of Vazapp to evaluate its impact on the local social capital. The results concern the effectiveness of this initiative in generating trustworthy relationships within the rural area of southern Italy, typically affected by individualism and mistrust. The number of relationships represents the quantitative indicator to define the dimension of the network development; while the typologies of relationships (from simple friendship to formal collaborations, for branding new cooperation initiatives) represents the qualitative indicator that offers a diversified perspective of the network impact. From the analysis carried out, Vazapp’s initiative represents surely a virtuous SI model to catalyze the relationships within the rural areas and to develop entrepreneurship based on the real needs of the community. Procedia PDF Downloads 9723890 The Duty of Application and Connection Providers Regarding the Supply of Internet Protocol by Court Order in Brazil to Determine Authorship of Acts Practiced on the Internet
Authors: João Pedro Albino, Ana Cláudia Pires Ferreira de Lima
Abstract:
Humanity has undergone a transformation from the physical to the virtual world, generating an enormous amount of data on the world wide web, known as big data. Many facts that occur in the physical world or in the digital world are proven through records made on the internet, such as digital photographs, posts on social media, contract acceptances by digital platforms, email, banking, and messaging applications, among others. These data recorded on the internet have been used as evidence in judicial proceedings. The identification of internet users is essential for the security of legal relationships. This research was carried out on scientific articles and materials from courses and lectures, with an analysis of Brazilian legislation and some judicial decisions on the request of static data from logs and Internet Protocols (IPs) from application and connection providers. In this article, we will address the determination of authorship of data processing on the internet by obtaining the IP address and the appropriate judicial procedure for this purpose under Brazilian law.Keywords: IP address, digital forensics, big data, data analytics, information and communication technology
Procedia PDF Downloads 10723889 Sourcing and Compiling a Maltese Traffic Dataset MalTra
Authors: Gabriele Borg, Alexei De Bono, Charlie Abela
Abstract:
There on a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is also constantly experiencing ongoing population growth and an increase in mobilization demand. This research takes advantage of data which is continuously being sourced and converting it into useful information related to the traffic problem on the Maltese roads. The scope of this paper is to provide a methodology to create a custom dataset (MalTra - Malta Traffic) compiled from multiple participants from various locations across the island to identify the most common routes taken to expose the main areas of activity. This use of big data is seen being used in various technologies and is referred to as ITSs (Intelligent Transportation Systems), which has been concluded that there is significant potential in utilising such sources of data on a nationwide scale.Keywords: Big Data, vehicular traffic, traffic management, mobile data patterns
Procedia PDF Downloads 93