Search results for: missing data
24816 Integration Process and Analytic Interface of Different Environmental Open Data Sets with Java/Oracle and R
Authors: Pavel H. Llamocca, Victoria Lopez
Abstract:
The main objective of our work is the comparative analysis of environmental data from Open Data bases belonging to different governments, which requires integrating data from several different sources. Nowadays, many governments intend to publish thousands of data sets for people and organizations to use, and the number of applications based on Open Data is therefore increasing. However, each government has its own procedures for publishing its data, which results in a variety of data set formats, because there are no international standards specifying the formats of Open Data sets. Due to this variety of formats, we must build a data integration process that is able to bring together all kinds of formats. Some software tools have been developed to support the integration process, e.g. Data Tamer and Data Wrangler. The problem with these tools is that they require a data scientist to take part in the integration process as a final step. In our case we do not want to depend on a data scientist, because environmental data are usually similar and these processes can be automated by programming. The main idea of our tool is to build Hadoop procedures adapted to the data sources of each government in order to achieve an automated integration. Our work focuses on environmental data such as temperature, energy consumption, air quality, solar radiation, wind speed, etc. For the past two years, the government of Madrid has been publishing its Open Data bases on environmental indicators in real time. Likewise, other governments (such as Andalucia or Bilbao) have published Open Data sets on the environment. All of those data sets have different formats; our solution is able to integrate all of them and, furthermore, allows the user to perform and visualize analyses over the real-time data. Once the integration task is done, all the data from any government share the same format and the analysis process can be initiated in a computationally better way. So the tool presented in this work has two goals: 1. the integration process; and 2. a graphic and analytic interface. As a first approach, the integration process was developed using Java and Oracle and the graphic and analytic interface with Java (JSP). However, in order to open our software tool, as a second approach we also developed an implementation in the R language as a mature open-source technology. R is a powerful open-source programming language that allows us to process and analyze a huge amount of data with high performance. There are also R libraries for building a graphic interface, such as shiny. A performance comparison between both implementations was made and no significant differences were found. In addition, our work provides an official real-time integrated data set of environmental data in Spain to any developer, so that they can build their own applications.
Keywords: open data, R language, data integration, environmental data
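A minimal Python sketch of the per-source integration idea described above (the authors implemented it in Java/Oracle and R; this is only an illustration). Each source gets its own adapter that maps its raw file onto one common schema; the file layouts, column names, and sources shown here are assumptions, not the actual Open Data formats.

```python
import pandas as pd

# Common schema every adapter must produce.
COMMON_COLUMNS = ["station", "timestamp", "indicator", "value", "unit"]

def load_madrid(path):
    # Illustrative assumption: Madrid publishes semicolon-separated CSV.
    raw = pd.read_csv(path, sep=";")
    return pd.DataFrame({
        "station": raw["ESTACION"],
        "timestamp": pd.to_datetime(raw["FECHA"], dayfirst=True),
        "indicator": raw["MAGNITUD"],
        "value": raw["VALOR"],
        "unit": raw["UNIDAD"],
    })[COMMON_COLUMNS]

def load_bilbao(path):
    # Illustrative assumption: Bilbao publishes a JSON list of records.
    raw = pd.read_json(path)
    return pd.DataFrame({
        "station": raw["station_name"],
        "timestamp": pd.to_datetime(raw["date"]),
        "indicator": raw["pollutant"],
        "value": raw["measurement"],
        "unit": raw["units"],
    })[COMMON_COLUMNS]

ADAPTERS = {"madrid": load_madrid, "bilbao": load_bilbao}

def integrate(sources):
    """sources: dict mapping adapter name -> file path; returns one unified table."""
    frames = [ADAPTERS[name](path) for name, path in sources.items()]
    return pd.concat(frames, ignore_index=True)
```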
Procedia PDF Downloads 315
24815 Dual-use UAVs in Armed Conflicts: Opportunities and Risks for Cyber and Electronic Warfare
Authors: Piret Pernik
Abstract:
Based on a strategic, operational, and technical analysis of the ongoing armed conflict in Ukraine, this paper will examine the opportunities and risks of using small commercial drones (dual-use unmanned aerial vehicles, UAV) for military purposes. The paper discusses the opportunities and risks in the information domain, encompassing both cyber and electromagnetic interference and attacks. The paper will draw conclusions on the possible strategic impact that the widespread use of dual-use UAVs may have on battlefield outcomes in modern armed conflicts. This article contributes to filling a gap in the literature by examining cyberattacks and electromagnetic interference on the basis of empirical data. Today, more than one hundred states and non-state actors possess UAVs, ranging from low-cost commodity models, many of them dual-use and widely available and affordable to anyone, to high-cost combat UAVs (UCAV) with lethal kinetic strike capabilities, which can be enhanced with Artificial Intelligence (AI) and Machine Learning (ML). Dual-use UAVs have been used by various actors for intelligence, reconnaissance, surveillance, situational awareness, geolocation, and kinetic targeting. Thus, they function as force multipliers enabling kinetic and electronic warfare attacks, and they provide comparative and asymmetric operational and tactical advantages. Some go as far as to argue that automated (or semi-automated) systems can change the character of warfare, while others observe that the use of small drones has not changed the balance of power or battlefield outcomes. UAVs give considerable opportunities to commanders; for example, because they can be operated without GPS navigation, they are less vulnerable and less dependent on satellite communications. They can be, and have been, used to conduct cyberattacks, electromagnetic interference, and kinetic attacks. However, they are highly vulnerable to those attacks themselves. So far, strategic studies, literature, and expert commentary have overlooked the cybersecurity and electronic interference dimension of the use of dual-use UAVs. Studies that link technical analysis of opportunities and risks with strategic battlefield outcomes are missing. It is expected that the proliferation of dual-use commercial UAVs in armed and hybrid conflicts will continue and accelerate in the future. Therefore, it is important to understand the specific opportunities and risks related to the crowdsourced use of dual-use UAVs, which can have kinetic effects. Technical countermeasures to protect UAVs differ depending on the type of UAV (small, midsize, large, stealth combat), and this paper offers a unique analysis of small UAVs both from the view of opportunities and risks for commanders and other actors in armed conflict.
Keywords: dual-use technology, cyber attacks, electromagnetic warfare, case studies of cyberattacks in armed conflicts
Procedia PDF Downloads 102
24814 Transforming Data into Knowledge: Mathematical and Statistical Innovations in Data Analytics
Authors: Zahid Ullah, Atlas Khan
Abstract:
The rapid growth of data in various domains has created a pressing need for effective methods to transform this data into meaningful knowledge. In this era of big data, mathematical and statistical innovations play a crucial role in unlocking insights and facilitating informed decision-making in data analytics. This abstract aims to explore the transformative potential of these innovations and their impact on converting raw data into actionable knowledge. Drawing upon a comprehensive review of existing literature, this research investigates the cutting-edge mathematical and statistical techniques that enable the conversion of data into knowledge. By evaluating their underlying principles, strengths, and limitations, we aim to identify the most promising innovations in data analytics. To demonstrate the practical applications of these innovations, real-world datasets will be utilized through case studies or simulations. This empirical approach will showcase how mathematical and statistical innovations can extract patterns, trends, and insights from complex data, enabling evidence-based decision-making across diverse domains. Furthermore, a comparative analysis will be conducted to assess the performance, scalability, interpretability, and adaptability of different innovations. By benchmarking against established techniques, we aim to validate the effectiveness and superiority of the proposed mathematical and statistical innovations in data analytics. Ethical considerations surrounding data analytics, such as privacy, security, bias, and fairness, will be addressed throughout the research. Guidelines and best practices will be developed to ensure the responsible and ethical use of mathematical and statistical innovations in data analytics. The expected contributions of this research include advancements in mathematical and statistical sciences, improved data analysis techniques, enhanced decision-making processes, and practical implications for industries and policymakers. The outcomes will guide the adoption and implementation of mathematical and statistical innovations, empowering stakeholders to transform data into actionable knowledge and drive meaningful outcomes.Keywords: data analytics, mathematical innovations, knowledge extraction, decision-making
Procedia PDF Downloads 75
24813 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule
Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu
Abstract:
The instance selection (IS) technique is used to reduce data size in order to improve the performance of data mining methods. Recently, to process very large data sets, several proposed methods divide the training set into disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitations of these methods and give our viewpoint on how to divide and conquer in the IS procedure. Then, based on the fast condensed nearest neighbor (FCNN) rule, we propose an instance selection method for large data sets with the MapReduce framework. Besides ensuring prediction accuracy and reduction rate, it has two desirable properties: first, it reduces the workload in the aggregation node; second, and most important, it produces the same result as the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.
Keywords: instance selection, data reduction, MapReduce, kNN
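For illustration, a sketch of the classic sequential condensed nearest neighbor selection that the FCNN family builds on; it is not the authors' FCNN-MR algorithm and omits the faster FCNN variant as well as the MapReduce partitioning and aggregation step.

```python
import numpy as np

def condensed_nearest_neighbor(X, y, rng=None):
    """Hart's condensed nearest neighbor (CNN) selection.

    Returns indices of a consistent subset: every training instance is
    classified correctly by a 1-NN rule over the selected subset.
    """
    rng = np.random.default_rng(rng)
    n = len(X)
    selected = [int(rng.integers(n))]        # seed with one random instance
    changed = True
    while changed:
        changed = False
        for i in rng.permutation(n):
            S = np.asarray(selected)
            # 1-NN prediction of instance i using the current subset
            d = np.linalg.norm(X[S] - X[i], axis=1)
            if y[S[np.argmin(d)]] != y[i]:
                selected.append(int(i))      # absorb misclassified instance
                changed = True
    return np.asarray(selected)

# Usage sketch on synthetic data
X = np.random.rand(500, 4)
y = (X[:, 0] + X[:, 1] > 1).astype(int)
idx = condensed_nearest_neighbor(X, y, rng=0)
print(f"kept {len(idx)} of {len(X)} instances")
```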
Procedia PDF Downloads 254
24812 A Design Framework for an Open Market Platform of Enriched Card-Based Transactional Data for Big Data Analytics and Open Banking
Authors: Trevor Toy, Josef Langerman
Abstract:
Around a quarter of the world’s data is generated by the financial sector, with global non-cash transactions estimated at 708.5 billion. With Open Banking still a rapidly developing concept within the financial industry, there is an opportunity to create a secure mechanism for connecting its stakeholders to openly, legitimately and consensually share the data required to enable it. Integration and sharing of anonymised transactional data are still operated in silos and centralised between the large corporate entities in the ecosystem that have the resources to do so. Smaller fintechs generating data and businesses looking to consume data are largely excluded from the process. There is therefore a growing demand for accessible transactional data for analytical purposes and to support the rapid global adoption of Open Banking. The following research provides a solution framework that aims to deliver a secure decentralised marketplace for 1) data providers to list their transactional data, 2) data consumers to find and access that data, and 3) data subjects (the individuals making the transactions that generate the data) to manage and sell the data that relates to themselves. The platform also provides an integrated system for downstream transactional-related data from merchants, enriching the data product available to build a comprehensive view of a data subject’s spending habits. A robust and sustainable data market can be developed by providing a more accessible mechanism for data producers to monetise their data investments and by encouraging data subjects to share their data through the same financial incentives. At the centre of the platform is the market mechanism that connects the data providers and their data subjects to the data consumers. This core component of the platform is developed on a decentralised blockchain contract, with a market layer that manages the transaction, user, pricing, payment, tagging, contract, control, and lineage features that pertain to user interactions on the platform. One of the platform’s key features is enabling the participation and management of personal data by the individuals from whom the data is being generated. This framework was developed as a proof-of-concept on the Ethereum blockchain, whereby an individual can securely manage access to their own personal data and that individual’s identifiable relationship to the card-based transaction data provided by financial institutions. This gives data consumers access to a complete view of transactional spending behaviour in correlation to key demographic information. This platform solution can ultimately support the growth, prosperity, and development of economies, businesses, communities, and individuals by providing accessible and relevant transactional data for big data analytics and open banking.
Keywords: big data markets, open banking, blockchain, personal data management
Procedia PDF Downloads 74
24811 Vertically Coupled III-V/Silicon Single Mode Laser with a Hybrid Grating Structure
Abstract:
Silicon photonics has gained much interest and extensive research as a promising approach to fabricating compact, high-speed and low-cost photonic devices compatible with the complementary metal-oxide-semiconductor (CMOS) process. Despite the remarkable progress made in the development of silicon photonics, high-performance, cost-effective, and reliable silicon laser sources are still missing. In this work, we present a 1550 nm III-V/silicon laser design with stable single-mode lasing and robust, high-efficiency vertical coupling. The InP cavity consists of two uniform Bragg grating sections at the sides for mode selection and feedback, as well as a central second-order grating for surface emission. A grating coupler is etched on the SOI waveguide, by which light coupling between the parallel III-V and SOI layers is achieved vertically rather than by evanescent-wave coupling. The laser characteristics are simulated and optimized with a traveling-wave model (TWM) and a Green’s function analysis, as well as a 2D finite-difference time-domain (FDTD) method for the coupling process. The simulation results show that single-mode lasing with an SMSR better than 48 dB is achievable, and the threshold current is less than 15 mA with a slope efficiency of around 0.13 W/A. The coupling efficiency is larger than 42% and has a high tolerance, with less than 10% reduction for a 10 um horizontal or 15 um vertical dislocation. The design can be realized by standard flip-chip bonding techniques without co-fabrication of III-V and silicon or precise alignment.
Keywords: III-V/silicon integration, silicon photonics, single mode laser, vertical coupling
Procedia PDF Downloads 156
24810 Analyze the Effect of TETRA, Terrestrial Trunked Radio, Signal on the Health of People Working in the Gas Refinery
Authors: Mohammad Bagher Heidari, Hefzollah Mohammadian
Abstract:
TETRA (Terrestrial Trunked Radio) is a digital radio communication standard which has been implemented in several different parts of the ninth gas refinery (phase 12) by the South Pars Gas Complex. Studies on possible impacts on users' health under different exposure conditions are missing. Objectives: To investigate possible acute effects of the electromagnetic fields (EMF) of two different levels of TETRA hand-held transmitter signals on cognitive function and well-being in healthy young males. Methods: In the present double-blind cross-over study, possible effects of short-term (2.5 h) EMF exposure to handset-like TETRA signals (450-470 MHz) were studied in 30 healthy male participants (mean ± SD: 25.4 ± 2.6 years). Individuals were tested on nine study days, on which they were exposed to three different exposure conditions (sham, TETRA 1.5 W/kg and TETRA 10.0 W/kg) in a randomly assigned and balanced order. Participants were tested in the afternoon within a fixed timeframe. Results: Attention remained unchanged in two out of three tasks. In working memory, significant changes were observed in two out of four subtasks. Significant results were found in 5 out of 35 tested parameters, four of which corresponded to an improvement in performance. Mood, well-being and subjective somatic complaints were not affected by TETRA exposure. Conclusions: The results of the present study do not indicate a negative impact of short-term EMF exposure from TETRA on cognitive function and well-being in healthy young men.
Keywords: TETRA (terrestrial trunked radio), electromagnetic fields (EMF), mobile telecommunication health research (MTHR), antenna
Procedia PDF Downloads 297
24809 Experimental Evaluation of Succinct Ternary Tree
Authors: Dmitriy Kuptsov
Abstract:
Tree data structures, such as binary or, in general, k-ary trees, are essential in computer science. Applications of these data structures range from data search and retrieval to sorting and ranking algorithms. Naive implementations of these data structures can consume prohibitively large volumes of random access memory, limiting their applicability in certain solutions. In these cases, a more advanced representation of the data structures is essential. In this paper we present the design of a compact version of the ternary tree data structure and demonstrate the results of an experimental evaluation using the static dictionary problem. We compare these results with those for binary and regular ternary trees. The conducted evaluation shows that our design, in the best case, consumes up to 12 times less memory (for the dictionary used in our experimental evaluation) than a regular ternary tree and, in certain configurations, shows performance comparable to regular ternary trees. We have evaluated the performance of the algorithms on both 32- and 64-bit operating systems.
Keywords: algorithms, data structures, succinct ternary tree, performance evaluation
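As a point of reference, a plain pointer-based ternary search tree for the static dictionary problem is sketched below; the paper's contribution is a succinct (compact) encoding of such a tree, which this sketch does not reproduce.

```python
class TSTNode:
    __slots__ = ("ch", "lo", "eq", "hi", "is_word")   # keep per-node overhead small
    def __init__(self, ch):
        self.ch, self.lo, self.eq, self.hi, self.is_word = ch, None, None, None, False

class TernarySearchTree:
    """Plain pointer-based ternary search tree used as a string dictionary."""
    def __init__(self):
        self.root = None

    def insert(self, word):
        def _insert(node, i):
            ch = word[i]
            if node is None:
                node = TSTNode(ch)
            if ch < node.ch:
                node.lo = _insert(node.lo, i)
            elif ch > node.ch:
                node.hi = _insert(node.hi, i)
            elif i + 1 < len(word):
                node.eq = _insert(node.eq, i + 1)
            else:
                node.is_word = True
            return node
        if word:
            self.root = _insert(self.root, 0)

    def contains(self, word):
        node, i = self.root, 0
        while node is not None and word:
            ch = word[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i + 1 == len(word):
                return node.is_word
            else:
                node, i = node.eq, i + 1
        return False

# Usage
tst = TernarySearchTree()
for w in ["cat", "cap", "car", "dog"]:
    tst.insert(w)
print(tst.contains("cap"), tst.contains("cab"))   # True False
```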
Procedia PDF Downloads 160
24808 Predicting Data Center Resource Usage Using Quantile Regression to Conserve Energy While Fulfilling the Service Level Agreement
Authors: Ahmed I. Alutabi, Naghmeh Dezhabad, Sudhakar Ganti
Abstract:
Data centers have been growing in size and demand continuously over the last two decades. Planning for the deployment of resources has been shallow and has always resorted to over-provisioning. Data center operators try to maximize the availability of their services by allocating multiples of the needed resources. One resource that has been wasted, with little thought, has been energy. In recent years, programmable resource allocation has paved the way for more efficient and robust data centers. In this work, we examine the predictability of resource usage in a data center environment. We use a number of models that cover a wide spectrum of machine learning categories. Then we establish a framework to guarantee the client service level agreement (SLA). Our results show that using prediction can cut energy loss by up to 55%.
Keywords: machine learning, artificial intelligence, prediction, data center, resource allocation, green computing
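A hedged sketch of the quantile-regression idea with scikit-learn: provisioning to an upper-quantile forecast rather than the mean trades a small, controlled risk of SLA violation against the capacity wasted by blanket over-provisioning. The synthetic usage trace and features below are illustrative, not the paper's data or model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic CPU-usage history: lagged usage and hour of day as features.
rng = np.random.default_rng(0)
hours = np.arange(2000) % 24
usage = 40 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, 2000)
X = np.column_stack([np.roll(usage, 1), np.roll(usage, 2), hours])[2:]
y = usage[2:]

# Upper-quantile model deliberately over-estimates demand so that provisioning
# to its forecast rarely violates the SLA, while still releasing head-room.
q90 = GradientBoostingRegressor(loss="quantile", alpha=0.90).fit(X, y)
median = GradientBoostingRegressor(loss="quantile", alpha=0.50).fit(X, y)

x_now = X[-1:]
print("median forecast:", median.predict(x_now)[0])
print("90th-percentile forecast (provision to this):", q90.predict(x_now)[0])
```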
Procedia PDF Downloads 109
24807 Prosperous Digital Image Watermarking Approach by Using DCT-DWT
Authors: Prabhakar C. Dhavale, Meenakshi M. Pawar
Abstract:
Every day, tons of data are embedded on digital media or distributed over the internet. The data are so widely distributed that they can easily be replicated without error, putting the rights of their owners at risk. Even when encrypted for distribution, data can easily be decrypted and copied. One way to discourage illegal duplication is to insert information known as a watermark into potentially valuable data in such a way that it is impossible to separate the watermark from the data. These challenges have motivated researchers to carry out intense research in the field of watermarking. A watermark is a form, image or text that is impressed onto paper and provides evidence of its authenticity. Digital watermarking is an extension of the same concept. There are two types of watermarks: visible and invisible. In this project, we have concentrated on embedding watermarks in images. The main consideration for any watermarking scheme is its robustness to various attacks.
Keywords: watermarking, digital, DCT-DWT, security
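One common DWT-DCT embedding variant, sketched in Python for illustration; the exact subband, coefficient range, and strength factor used in the paper may differ.

```python
import numpy as np
import pywt
from scipy.fft import dctn, idctn

def embed_watermark(image, watermark, alpha=0.05):
    """Additively embed a small +/-1 watermark into the DCT of the LL subband.

    image:     2-D float array (grayscale host image)
    watermark: 2-D array of +/-1 values, smaller than the LL subband
    """
    cA, (cH, cV, cD) = pywt.dwt2(image, "haar")        # one-level DWT
    coeffs = dctn(cA, norm="ortho")                     # DCT of approximation band
    h, w = watermark.shape
    block = coeffs[8:8 + h, 8:8 + w]                    # mid-frequency region (illustrative)
    coeffs[8:8 + h, 8:8 + w] = block + alpha * np.abs(block) * watermark
    cA_marked = idctn(coeffs, norm="ortho")
    return pywt.idwt2((cA_marked, (cH, cV, cD)), "haar")

# Usage with random data standing in for a real image
rng = np.random.default_rng(1)
host = rng.random((256, 256)) * 255
mark = rng.choice([-1.0, 1.0], size=(32, 32))
marked = embed_watermark(host, mark)
print("mean absolute change:", np.abs(marked - host).mean())
```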
Procedia PDF Downloads 423
24806 Machine Learning Data Architecture
Authors: Neerav Kumar, Naumaan Nayyar, Sharath Kashyap
Abstract:
Most companies see an increase in the adoption of machine learning (ML) applications across internal and external-facing use cases. ML applications vend output in either batch or real-time patterns. A complete batch ML pipeline architecture comprises data sourcing, feature engineering, model training, model deployment, and vending of model output into a data store for downstream applications. Due to unclear role expectations, we have observed that scientists specializing in building and optimizing models invest significant effort into building the other components of the architecture, which we do not believe is the best use of scientists’ bandwidth. We propose a system architecture created using AWS services that brings industry best practices to managing the workflow and simplifies the process of model deployment and end-to-end data integration for an ML application. This narrows the scope of scientists’ work to model building and refinement, while specialized data engineers take over deployment, pipeline orchestration, data quality, the data permission system, etc. The pipeline infrastructure is built and deployed as code (using terraform, cdk, cloudformation, etc.), which makes it easy to replicate and/or extend the architecture to other models used in an organization.
Keywords: data pipeline, machine learning, AWS, architecture, batch machine learning
Procedia PDF Downloads 65
24805 A Comparison of Image Data Representations for Local Stereo Matching
Authors: André Smith, Amr Abdel-Dayem
Abstract:
The stereo matching problem, while having been present for several decades, continues to be an active area of research. The goal of this research is to find correspondences between elements found in a set of stereoscopic images. With these pairings, it is possible to infer the distance of objects within a scene relative to the observer. Advancements in this field have led to experimentation with various techniques, from graph-cut energy minimization to artificial neural networks. At the basis of these techniques is a cost function, which is used to evaluate the likelihood of a particular match between points in each image. While, at its core, the cost is based on comparing image pixel data, there is a general lack of consistency as to which image data representation to use. This paper presents an experimental analysis to compare the effectiveness of the more common image data representations. The goal is to determine how effective these data representations are at reducing the cost for the correct correspondence relative to other possible matches.
Keywords: colour data, local stereo matching, stereo correspondence, disparity map
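A minimal local-matching sketch using a sum-of-absolute-differences (SAD) cost over grayscale intensities; swapping the image data representation only changes the per-pixel term, which is exactly the comparison the paper is concerned with. The window size and disparity range are arbitrary choices.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sad_disparity(left, right, max_disp=32, window=5):
    """Winner-takes-all local stereo matching with a SAD cost.

    left/right are 2-D float arrays (grayscale). Another representation
    (RGB channels, gradients, census bits) would change only the per-pixel
    cost; the window aggregation and argmin stay the same.
    """
    h, w = left.shape
    cost = np.full((h, w, max_disp), np.inf)
    for d in range(max_disp):
        diff = np.abs(left[:, d:] - right[:, :w - d])        # pixel-wise cost
        cost[:, d:, d] = uniform_filter(diff, size=window)    # window aggregation
    return np.argmin(cost, axis=2)                            # disparity map

# Usage with a synthetic pair: the right image is the left shifted by 7 pixels
rng = np.random.default_rng(0)
left = rng.random((100, 120))
right = np.roll(left, -7, axis=1)
disp = sad_disparity(left, right)
print("median disparity:", np.median(disp[:, 10:-10]))        # ~7 expected
```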
Procedia PDF Downloads 371
24804 Design of Demand Pacemaker Using an Embedded Controller
Authors: C. Bala Prashanth Reddy, B. Abhinay, C. Sreekar, D. V. Shobhana Priscilla
Abstract:
The project aims at designing an emergency pacemaker which is capable of delivering shocks to a human heart that has suddenly stopped working. A pacemaker is a machine commonly used by cardiologists in order to shock a human heart back into operation. The heart contains small cells, called pacemaker cells, that send electrical pulses telling the cardiac muscle when to pump blood. When these electrical pulses stop, the heart stops beating; a pacemaker is then used to shock the heart muscle and the pacemaker cells back into action. This is achieved by rubbing the two panels of the pacemaker together to create an adequate electrical current, after which the heart returns to its normal state. The project also aims at designing a system capable of continuously displaying a person's heart beat and blood pressure on an LCD. The concerned doctor receives the heart beat and blood pressure details continuously through the GSM modem in the form of SMS alerts. In case of an abnormal condition, the doctor sends a message specifying the amount of electric shock needed; the microcontroller then automatically gives the input to the pacemaker, which in turn delivers the shock to the patient. The heart beat monitor and display system is portable and a good replacement for the old-model stethoscope, which is less efficient. With a stethoscope, the heart beat rate is calculated manually, and the probability of error is high because the heart rate lies in the range of 70 to 90 beats per minute with beat intervals of less than 1 s, so this device can be considered a very good alternative to a stethoscope.
Keywords: missing R wave, PWM, demand pacemaker, heart
Procedia PDF Downloads 482
24803 Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System
Authors: Karima Qayumi, Alex Norta
Abstract:
The rapid generation of a high volume and a broad variety of data from the application of new technologies poses challenges for the generation of business intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods for the purposes of developing their business. Therefore, today's decentralized data management environment relies on a distributed computing paradigm. While data are stored in highly distributed systems, the implementation of distributed data-mining techniques is a challenge. The aim of this technique is to gather knowledge from every domain and all the datasets stemming from distributed resources. As agent technologies offer significant contributions to managing the complexity of distributed systems, we consider them for next-generation data-mining processes. To demonstrate agent-based business intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.
Keywords: agent-oriented modeling (AOM), business intelligence model (BIM), distributed data mining (DDM), multi-agent system (MAS)
Procedia PDF Downloads 432
24802 Timing and Noise Data Mining Algorithm and Software Tool in Very Large Scale Integration (VLSI) Design
Authors: Qing K. Zhu
Abstract:
Very Large Scale Integration (VLSI) design has become very complex due to the continuous integration of millions of gates in one chip, following Moore’s law. Designers encounter numerous report files during design iterations using timing and noise analysis tools. This paper presents our work using data mining techniques combined with HTML tables to extract and represent critical timing/noise data. When we apply this data-mining tool in real applications, running speed is important. The software employs table look-up techniques to achieve reasonable running speed, based on performance testing results. We added several advanced features for the application in one industry chip design.
Keywords: VLSI design, data mining, big data, HTML forms, web, VLSI, EDA, timing, noise
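An illustrative sketch of the HTML-table extraction step with pandas; the report structure, column names, and thresholds are hypothetical, and the paper's own tool additionally uses table look-up techniques for speed that are not shown here.

```python
from io import StringIO
import pandas as pd

# Hypothetical HTML timing/noise report containing one table of paths.
html_report = """
<table>
  <tr><th>startpoint</th><th>endpoint</th><th>slack_ns</th><th>noise_mv</th></tr>
  <tr><td>u_alu/reg_a</td><td>u_alu/reg_q</td><td>-0.12</td><td>85</td></tr>
  <tr><td>u_fifo/wr_ptr</td><td>u_fifo/mem</td><td>0.34</td><td>40</td></tr>
  <tr><td>u_dec/state</td><td>u_dec/out</td><td>-0.03</td><td>120</td></tr>
</table>
"""

report = pd.read_html(StringIO(html_report))[0]   # read_html returns a list of tables

# "Critical" entries: negative timing slack or noise above an assumed threshold.
critical = report[(report["slack_ns"] < 0) | (report["noise_mv"] > 100)]
worst = critical.sort_values("slack_ns").head(10)
print(worst.to_string(index=False))
```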
Procedia PDF Downloads 254
24801 Introduction of Electronic Health Records to Improve Data Quality in Emergency Department Operations
Authors: Anuruddha Jagoda, Samiddhi Samarakoon, Anil Jasinghe
Abstract:
In its simplest form, data quality can be defined as 'fitness for use', and it is a multi-dimensional concept. Emergency Departments (ED) require information to treat patients and, on the other hand, are the primary source of information regarding accidents, injuries, emergencies, etc. They are also the starting point of various patient registries, databases and surveillance systems. This interventional study was carried out to improve data quality at the ED of the National Hospital of Sri Lanka (NHSL) by introducing an e-health solution. The NHSL is the premier trauma care centre in Sri Lanka. The study consisted of three components. A research study was conducted to assess the quality of data in relation to five selected dimensions of data quality, namely accuracy, completeness, timeliness, legibility and reliability. The intervention was to develop and deploy an electronic emergency department information system (eEDIS). Post-intervention assessment confirmed that all five dimensions of data quality had improved. The most significant improvements were noticed in the accuracy and timeliness dimensions.
Keywords: electronic health records, electronic emergency department information system, emergency department, data quality
Procedia PDF Downloads 276
24800 Data Presentation of Lane-Changing Events Trajectories Using HighD Dataset
Authors: Basma Khelfa, Antoine Tordeux, Ibrahima Ba
Abstract:
We present a descriptive analysis of lane-changing event data on multi-lane roads. The data come from the Highway Drone Dataset (HighD), which contains microscopic vehicle trajectories recorded on highways. This paper describes and analyses the role of the different parameters and their significance. Using the HighD data, we aim to find the most frequent reasons that motivate drivers to change lanes. We used the programming language R for processing these data. We analyze the involvement and relationship of the different variables for each parameter of the ego vehicle and the four vehicles surrounding it, i.e., distance, speed difference, time gap, and acceleration. This was studied according to the class of the vehicle (car or truck) and according to the maneuver it undertook (overtaking or falling back).
Keywords: autonomous driving, physical traffic model, prediction model, statistical learning process
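A sketch (in Python rather than the R used by the authors) of extracting per-lane-change metrics from a HighD tracks file; the column names and the convention that precedingId = 0 means "no leader" follow the public HighD description but should be verified against the actual files.

```python
import pandas as pd

# One row per vehicle per frame; columns assumed: frame, id, x, xVelocity,
# laneId, precedingId (check against the local HighD recording files).
tracks = pd.read_csv("01_tracks.csv")

def lane_change_events(tracks):
    """One row per lane change with gap and speed difference to the
    preceding vehicle at the frame where the lane id changes."""
    rows = []
    for vid, traj in tracks.sort_values("frame").groupby("id"):
        changes = traj[traj["laneId"].diff().fillna(0) != 0]
        for _, ego in changes.iterrows():
            lead_id = ego["precedingId"]
            if lead_id == 0:                      # assumed: 0 = no preceding vehicle
                continue
            lead = tracks[(tracks["id"] == lead_id) &
                          (tracks["frame"] == ego["frame"])]
            if lead.empty:
                continue
            lead = lead.iloc[0]
            gap = abs(lead["x"] - ego["x"])
            rows.append({"id": vid,
                         "frame": ego["frame"],
                         "gap_m": gap,
                         "speed_diff": ego["xVelocity"] - lead["xVelocity"],
                         "time_gap_s": gap / max(abs(ego["xVelocity"]), 0.1)})
    return pd.DataFrame(rows)

events = lane_change_events(tracks)
print(events.describe())
```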
Procedia PDF Downloads 262
24799 Evaluation of Golden Beam Data for the Commissioning of 6 and 18 MV Photons Beams in Varian Linear Accelerator
Authors: Shoukat Ali, Abdul Qadir Jandga, Amjad Hussain
Abstract:
Objective: The main purpose of this study is to compare the percent depth dose (PDD) and in-plane and cross-plane profiles of the Varian Golden Beam Data with the measured data for 6 and 18 MV photons for the commissioning of the Eclipse treatment planning system. Introduction: Commissioning of a treatment planning system requires an extensive acquisition of beam data for the clinical use of linear accelerators. Accurate dose delivery requires entering the PDDs, profiles and dose-rate tables for open and wedged fields into the treatment planning system, enabling it to calculate the MUs and dose distribution. Varian offers a generic set of beam data as reference data but does not recommend it for clinical use. In this study, we compared the generic beam data with the measured beam data to evaluate the reliability of the generic beam data for clinical purposes. Methods and Material: PDDs and profiles of open and wedged fields for different field sizes and at different depths were measured as per Varian’s algorithm commissioning guideline. The measurements were performed with a PTW 3D scanning water phantom with a Semiflex ion chamber and MEPHYSTO software. The online available Varian Golden Beam Data were compared with the measured data to evaluate whether the golden beam data are accurate enough to be used for the commissioning of the Eclipse treatment planning system. Results: The deviation between measured and golden beam data was at most 2%. In PDDs, the deviation increases more at deeper depths than at shallower depths. Similarly, profiles show the same trend of increasing deviation at large field sizes and increasing depths. Conclusion: The study shows that the percentage deviation between measured and golden beam data is within the acceptable tolerance and can therefore be used for the commissioning process; however, verification of a small subset of acquired data against the golden beam data should be mandatory before clinical use.
Keywords: percent depth dose, flatness, symmetry, golden beam data
Procedia PDF Downloads 490
24798 Variable-Fidelity Surrogate Modelling with Kriging
Authors: Selvakumar Ulaganathan, Ivo Couckuyt, Francesco Ferranti, Tom Dhaene, Eric Laermans
Abstract:
Variable-fidelity surrogate modelling offers an efficient way to approximate function data available in multiple degrees of accuracy, each with varying computational cost. In this paper, a Kriging-based variable-fidelity surrogate modelling approach is introduced to approximate such deterministic data. Initially, individual Kriging surrogate models, which are enhanced with gradient data of different degrees of accuracy, are constructed. Then these gradient-enhanced Kriging surrogate models are strategically coupled using a recursive CoKriging formulation to provide an accurate surrogate model for the highest-fidelity data. While, intuitively, gradient data is useful to enhance the accuracy of surrogate models, the primary motivation behind this work is to investigate if it is also worthwhile incorporating gradient data of varying degrees of accuracy.
Keywords: Kriging, CoKriging, surrogate modelling, variable-fidelity modelling, gradients
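A minimal two-fidelity sketch of the recursive idea (a low-fidelity model plus a discrepancy model) using ordinary Gaussian-process regression; it uses a fixed scaling factor and, unlike the paper, does not incorporate gradient observations.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy multi-fidelity pair: the cheap model is a shifted, scaled version
# of the expensive one (Forrester-style test functions).
def f_high(x): return (6 * x - 2) ** 2 * np.sin(12 * x - 4)
def f_low(x):  return 0.5 * f_high(x) + 10 * (x - 0.5) - 5

X_lo = np.linspace(0, 1, 11).reshape(-1, 1)      # many cheap samples
X_hi = np.array([[0.0], [0.4], [0.6], [1.0]])    # few expensive samples

gp_lo = GaussianProcessRegressor(RBF(0.2), normalize_y=True)
gp_lo.fit(X_lo, f_low(X_lo).ravel())

# Recursive step: model the discrepancy between high fidelity and the
# scaled low-fidelity prediction at the expensive sample sites.
rho = 2.0                                         # scaling factor (fixed here; normally estimated)
resid = f_high(X_hi).ravel() - rho * gp_lo.predict(X_hi)
gp_diff = GaussianProcessRegressor(RBF(0.2), normalize_y=True)
gp_diff.fit(X_hi, resid)

def predict_high(X):
    return rho * gp_lo.predict(X) + gp_diff.predict(X)

X_test = np.linspace(0, 1, 5).reshape(-1, 1)
print(np.c_[f_high(X_test).ravel(), predict_high(X_test)])
```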
Procedia PDF Downloads 558
24797 Robust Barcode Detection with Synthetic-to-Real Data Augmentation
Authors: Xiaoyan Dai, Hsieh Yisan
Abstract:
Barcode processing of captured images is a huge challenge, as different shooting conditions can result in different barcode appearances. This paper proposes a deep learning-based barcode detection method using synthetic-to-real data augmentation. We first augment the barcodes themselves; we then augment the images containing the barcodes to generate a large variety of data that is close to actual shooting environments. Comparisons with previous works and evaluations on our original data show that this approach achieves state-of-the-art performance on various real images. In addition, the system uses hybrid resolution for the barcode “scan” and is applicable to real-time applications.
Keywords: barcode detection, data augmentation, deep learning, image-based processing
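An illustrative augmentation pipeline of the synthetic-to-real kind described above; the crude stripe-pattern generator, transform ranges, and noise levels are assumptions, not the authors' settings.

```python
import numpy as np
import cv2

def synthetic_barcode(width=200, height=80, rng=None):
    """Render a crude 1-D stripe pattern standing in for a real barcode symbology."""
    rng = np.random.default_rng(rng)
    bars = rng.integers(0, 2, size=width // 2)
    row = np.repeat(bars, 2) * 255
    return np.tile(row, (height, 1)).astype(np.uint8)

def augment(img, rng=None):
    """Synthetic-to-real style augmentation: perspective, blur, illumination, noise."""
    rng = np.random.default_rng(rng)
    h, w = img.shape
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    jitter = rng.uniform(-0.08, 0.08, size=(4, 2)) * np.array([w, h])
    M = cv2.getPerspectiveTransform(src, (src + jitter).astype(np.float32))
    out = cv2.warpPerspective(img, M, (w, h), borderValue=255)
    out = cv2.GaussianBlur(out, (5, 5), sigmaX=rng.uniform(0.5, 2.0))
    gain = rng.uniform(0.6, 1.2)                         # illumination change
    noise = rng.normal(0, 8, out.shape)                  # sensor noise
    return np.clip(out * gain + noise, 0, 255).astype(np.uint8)

sample = augment(synthetic_barcode(rng=0), rng=1)
print(sample.shape, sample.dtype)
```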
Procedia PDF Downloads 173
24796 Barrier Membrane Influence Histology of Guided Bone Regenerations: A Systematic Review and Meta-Analysis
Authors: Laura Canagueral-Pellice, Antonio Munar-Frau, Adaia Valls-Ontanon, Joao Carames, Federico Hernandez-Alfaro, Jordi Caballe-Serrano
Abstract:
Objective: Guided bone regeneration (GBR) aims to replace missing bone with a new structure to achieve long-term stability of rehabilitations. The aim of the present systematic review and meta-analysis is to determine the effect of barrier membranes on histological outcomes after GBR procedures. Moreover, the effects of the grafting material and tissue gain were analyzed. Materials & methods: Two independent reviewers performed an electronic search in PubMed and Scopus, identifying all eligible publications up to March 2020. Only randomized controlled trials (RCTs) including a histological analysis of augmented areas were included. Results: A total of 6 publications were included in the present systematic review. A total of 110 biopsied sites were analysed; 10 corresponded to vertical bone augmentation procedures, whereas 100 analysed horizontal regeneration procedures. A mean tissue gain of 3 ± 1.48 mm was obtained for horizontal defects. Histological assessment of new bone formation, residual particles and sub-epithelial connective tissue (SCT) was reported. The four main barrier membranes used were natural collagen membranes, e-PTFE, polylactic resorbable membranes and acellular dermal matrix membranes (AMDG). The analysis demonstrated that resorbable membranes result in higher values of new bone formation and lower values of residual particles and SCT. Xenografts resulted in lower new bone formation compared to allografts; however, no statistically significant differences were observed regarding residual particles and SCT. Overall, regeneration procedures adding autogenous bone, plasma derivatives or growth factors achieved, in general, greater new bone formation and tissue gain. Conclusions: There is limited evidence favoring the effect of a certain type of barrier membrane in GBR. The data need to be evaluated carefully; however, resorbable membranes are correlated with greater new bone formation values, especially when combined with allograft materials and/or the addition of autogenous bone, platelet-rich plasma (PRP) or growth factors in the regeneration area. More studies assessing the histological outcomes of different GBR protocols and procedures testing different biomaterials are needed to maximize the clinical and histological outcomes in bone regeneration science.
Keywords: barrier membrane, graft material, guided bone regeneration, implant surgery, histology
Procedia PDF Downloads 152
24795 Analysis of Delivery of Quad Play Services
Authors: Rahul Malhotra, Anurag Sharma
Abstract:
Fiber-based access networks can deliver performance that supports the increasing demand for high-speed connections. One of the new technologies that has emerged in recent years is the Passive Optical Network (PON). This paper aims to show the simultaneous delivery of triple play services (data, voice, and video). A comparative investigation of various data rates and their suitability is presented. It is demonstrated that as the data rate increases, the number of users that can be accommodated decreases due to the increase in bit error rate.
Keywords: FTTH, quad play, play service, access networks, data rate
Procedia PDF Downloads 417
24794 Classification of Manufacturing Data for Efficient Processing on an Edge-Cloud Network
Authors: Onyedikachi Ulelu, Andrew P. Longstaff, Simon Fletcher, Simon Parkinson
Abstract:
The widespread interest in 'Industry 4.0' or 'digital manufacturing' has led to significant research requiring the acquisition of data from sensors, instruments, and machine signals. In-depth research then identifies methods of analysis of the massive amounts of data generated before and during manufacture to solve a particular problem. The ultimate goal is for industrial Internet of Things (IIoT) data to be processed automatically to assist with either visualisation or autonomous system decision-making. However, the collection and processing of data in an industrial environment come with a cost. Little research has been undertaken on how to specify optimally what data to capture, transmit, process, and store at various levels of an edge-cloud network. The first step in this specification is to categorise IIoT data for efficient and effective use. This paper proposes the required attributes and classification to take manufacturing digital data from various sources to determine the most suitable location for data processing on the edge-cloud network. The proposed classification framework will minimise overhead in terms of network bandwidth/cost and processing time of machine tool data via efficient decision making on which dataset should be processed at the ‘edge’ and what to send to a remote server (cloud). A fast-and-frugal heuristic method is implemented for this decision-making. The framework is tested using case studies from industrial machine tools for machine productivity and maintenance.Keywords: data classification, decision making, edge computing, industrial IoT, industry 4.0
Procedia PDF Downloads 182
24793 Denoising Transient Electromagnetic Data
Authors: Lingerew Nebere Kassie, Ping-Yu Chang, Hsin-Hua Huang, Chaw-Son Chen
Abstract:
Transient electromagnetic (TEM) data play a crucial role in hydrogeological and environmental applications, providing valuable insights into geological structures and resistivity variations. However, the presence of noise often hinders the interpretation and reliability of these data. Our study addresses this issue by utilizing a FASTSNAP system for the TEM survey, which operates in different modes (low, medium, and high) with continuous adjustments to discretization, gain, and current. We employ a denoising approach that processes the raw data obtained from each acquisition mode to improve signal quality and enhance data reliability. We use a signal-averaging technique for each mode, increasing the signal-to-noise ratio. Additionally, we utilize the wavelet transform to suppress noise further while preserving the integrity of the underlying signals. This approach significantly improves the data quality, notably suppressing severe noise at late times. The resulting denoised data exhibit a substantially improved signal-to-noise ratio, leading to increased accuracy in parameter estimation. By effectively denoising TEM data, our study contributes to a more reliable interpretation and analysis of underground structures. Moreover, the proposed denoising approach can be seamlessly integrated into existing ground-based TEM data processing workflows, facilitating the extraction of meaningful information from noisy measurements and enhancing the overall quality and reliability of the acquired data.
Keywords: data quality, signal averaging, transient electromagnetic, wavelet transform
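A minimal sketch of the two steps named in the abstract, stacking repeated decays and wavelet-thresholding the average; the wavelet, decomposition level, and threshold rule are illustrative choices, not necessarily those used in the study.

```python
import numpy as np
import pywt

def denoise_tem(decays, wavelet="db4", level=4):
    """Stack repeated transient decays, then wavelet-threshold the average.

    decays: 2-D array, one measured decay curve per row (same time gates).
    """
    stacked = decays.mean(axis=0)                       # signal averaging
    coeffs = pywt.wavedec(stacked, wavelet, level=level)
    # Universal threshold estimated from the finest-scale detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(stacked)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(stacked)]

# Synthetic example: an exponential decay buried in noise, measured 32 times.
rng = np.random.default_rng(0)
t = np.linspace(1e-5, 1e-2, 512)
clean = 1e3 * np.exp(-t / 2e-3)
noisy = clean + rng.normal(0, 20, size=(32, t.size))
recovered = denoise_tem(noisy)
print("residual RMS:", np.sqrt(np.mean((recovered - clean) ** 2)))
```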
Procedia PDF Downloads 86
24792 The Potential Involvement of Platelet Indices in Insulin Resistance in Morbid Obese Children
Authors: Orkide Donma, Mustafa M. Donma
Abstract:
Association between insulin resistance (IR) and hematological parameters has long been a matter of interest. Within this context, body mass index (BMI), red blood cells, white blood cells and platelets were involved in this discussion. Parameters related to platelets associated with IR may be useful indicators for the identification of IR. Platelet indices such as mean platelet volume (MPV), platelet distribution width (PDW) and plateletcrit (PCT) are being questioned for their possible association with IR. The aim of this study was to investigate the association between platelet (PLT) count as well as PLT indices and the surrogate indices used to determine IR in morbid obese (MO) children. A total of 167 children participated in the study. Three groups were constituted. The number of cases was 34, 97 and 36 children in the normal BMI, MO and metabolic syndrome (MetS) groups, respectively. Sex- and age-dependent BMI-based percentile tables prepared by World Health Organization were used for the definition of morbid obesity. MetS criteria were determined. BMI values, homeostatic model assessment for IR (HOMA-IR), alanine transaminase-to-aspartate transaminase ratio (ALT/AST) and diagnostic obesity notation model assessment laboratory (DONMA-lab) index values were computed. PLT count and indices were analyzed using automated hematology analyzer. Data were collected for statistical analysis using SPSS for Windows. Arithmetic mean and standard deviation were calculated. Mean values of PLT-related parameters in both control and study groups were compared by one-way ANOVA followed by Tukey post hoc tests to determine whether a significant difference exists among the groups. The correlation analyses between PLT as well as IR indices were performed. Statistically significant difference was accepted as p-value < 0.05. Increased values were detected for PLT (p < 0.01) and PCT (p > 0.05) in MO group compared to those observed in children with N-BMI. Significant increases for PLT (p < 0.01) and PCT (p < 0.05) were observed in MetS group in comparison with the values obtained in children with N-BMI (p < 0.01). Significantly lower MPV and PDW values were obtained in MO group compared to the control group (p < 0.01). HOMA-IR (p < 0.05), DONMA-lab index (p < 0.001) and ALT/AST (p < 0.001) values in MO and MetS groups were significantly increased compared to the N-BMI group. On the other hand, DONMA-lab index values also differed between MO and MetS groups (p < 0.001). In the MO group, PLT was negatively correlated with MPV and PDW values. These correlations were not observed in the N-BMI group. None of the IR indices exhibited a correlation with PLT and PLT indices in the N-BMI group. HOMA-IR showed significant correlations both with PLT and PCT in the MO group. All of the three IR indices were well-correlated with each other in all groups. These findings point out the missing link between IR and PLT activation. In conclusion, PLT and PCT may be related to IR in addition to their identities as hemostasis markers during morbid obesity. Our findings have suggested that DONMA-lab index appears as the best surrogate marker for IR due to its discriminative feature between morbid obesity and MetS.Keywords: children, insulin resistance, metabolic syndrome, plateletcrit, platelet indices
Procedia PDF Downloads 107
24791 An Embarrassingly Simple Semi-supervised Approach to Increase Recall in Online Shopping Domain to Match Structured Data with Unstructured Data
Authors: Sachin Nagargoje
Abstract:
Complete labeled data is often difficult to obtain in a practical scenario. Even if one manages to obtain the data, its quality is always in question. In the shopping vertical, offers are the input data, provided by advertisers with or without good-quality information. In this paper, the author investigated the possibility of using a very simple semi-supervised learning approach to increase the recall of unhealthy offers (those with badly written offer titles or partial product details) in the shopping vertical domain. The author found that the semi-supervised learning method improved recall in the Smart Phone category by 30% in A/B testing on 10% of traffic and increased the year-over-year (YoY) number of impressions per month by 33% in production. This also produced a significant increase in revenue, but that cannot be publicly disclosed.
Keywords: semi-supervised learning, clustering, recall, coverage
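A hedged sketch of a simple self-training setup with scikit-learn; the toy offer titles, features, and confidence threshold are illustrative, and the paper does not state that this exact classifier was used.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.pipeline import make_pipeline

# Toy offer titles; 1 = "unhealthy" (badly written / partial details),
# 0 = healthy, -1 = unlabeled. A real offer feed supplies far more unlabeled rows.
titles = [
    "Apple iPhone 13 128GB Blue unlocked",        # healthy
    "phone cheap best buy now!!!",                # unhealthy
    "Samsung Galaxy S21 5G 256GB Phantom Gray",   # healthy
    "smartfone good pric gud",                    # unhealthy
    "Google Pixel 7 Pro 128GB Obsidian",          # unlabeled
    "new   phone.. details later",                # unlabeled
]
labels = np.array([0, 1, 0, 1, -1, -1])

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    # Pseudo-label unlabeled rows whose predicted probability exceeds the threshold.
    SelfTrainingClassifier(LogisticRegression(max_iter=1000), threshold=0.75),
)
model.fit(titles, labels)
print(model.predict(["grate fone buy cheep", "OnePlus 11 256GB Titan Black"]))
```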
Procedia PDF Downloads 122
24790 Development of Microsatellite Markers for Genetic Variation Analysis in House Cricket, Acheta domesticus
Authors: Yash M. Gupta, Kittisak Buddhachat, Surin Peyachoknagul, Somjit Homchan
Abstract:
The house cricket, Acheta domesticus, is one of the commonly found species of field crickets. Although it is very commonly used as food and feed, genomic information on the house cricket is still missing for genetic investigation. DNA sequencing technology has evolved over the decades, and it has also revolutionized molecular marker development for genetic analysis. In the present study, we sequenced the whole genome of A. domesticus using Illumina HiSeq X Ten sequencing technology to search for simple sequence repeats (SSRs) in the DNA and develop polymorphic microsatellite markers for population genetic analysis. A total of 112,157 SSRs with primer pairs were identified; 91 randomly selected SSRs were used to check DNA amplification, of which nine primers were polymorphic. These microsatellite markers showed cross-amplification with three other species of crickets, namely Gryllus bimaculatus, Gryllus testaceus and Brachytrupes portentosus. The nine polymorphic microsatellite markers were used to assess genetic variation in forty-five individuals of the A. domesticus Phitsanulok population, Thailand. Across the nine loci, the number of alleles ranged from 5 to 15, and the observed heterozygosity ranged from 0.4091 to 0.7556. These microsatellite markers will facilitate population genetic analysis in future studies of A. domesticus populations. Moreover, the transferability of these SSR markers will also enable researchers to conduct genetic studies in other closely related species.
Keywords: cross-amplification, microsatellite markers, observed heterozygosity, population genetic, simple sequence repeats
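For illustration, a small Python routine for locating perfect SSRs, the kind of search run over an assembled genome; the repeat-count threshold and motif lengths are arbitrary here, and real SSR/primer pipelines (e.g. MISA plus Primer3) do considerably more, including primer design.

```python
import re
from itertools import product

def find_ssrs(sequence, min_repeats=6, motif_lengths=(2, 3, 4)):
    """Scan a DNA sequence for perfect tandem repeats of short motifs.

    Returns (motif, start, end, copies) for motifs repeated at least
    `min_repeats` times.
    """
    seq = sequence.upper()
    hits = []
    for k in motif_lengths:
        for motif in ("".join(p) for p in product("ACGT", repeat=k)):
            if len(set(motif)) == 1:              # skip homopolymer motifs like "AA"
                continue
            pattern = f"(?:{motif}){{{min_repeats},}}"
            for m in re.finditer(pattern, seq):
                copies = (m.end() - m.start()) // k
                hits.append((motif, m.start(), m.end(), copies))
    return sorted(hits, key=lambda h: h[1])

demo = "TTGACACACACACACACAGGTTCAGCAGCAGCAGCAGCAGTTAAT"
for motif, start, end, copies in find_ssrs(demo):
    print(f"{motif} x{copies} at {start}-{end}")
```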
Procedia PDF Downloads 144
24789 Genodata: The Human Genome Variation Using BigData
Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta
Abstract:
Since the completion of the Human Genome Project, there has been an unparalleled escalation in the sequencing of genomic data. The project was the first major vault in the field of medical research, especially in genomics, and it won accolades for using a concept called Big Data, which had earlier been used extensively to create business value. Big Data makes use of data sets that generally come as files of terabytes, petabytes, or exabytes in size, whereas such data sets were traditionally managed using Excel sheets and RDBMS. The voluminous data made the process tedious and time-consuming, and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and more efficient. This paper focuses on using SPARK, which is gaining momentum with the advancement of Big Data technologies. Cloud storage is an effective medium for storing the large data sets generated from genetic research and the resultant sets produced by SPARK analysis.
Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop
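A minimal PySpark sketch of the kind of distributed processing the abstract refers to; the input path, column names, and variant classification rule are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("genodata-demo").getOrCreate()

# Hypothetical tab-separated variant calls with columns:
# sample_id, chromosome, position, reference, alternate, genotype
variants = (spark.read
            .option("sep", "\t").option("header", True)
            .csv("data/variants/*.tsv"))            # path is illustrative

# Classify each record as SNP or indel and count per chromosome.
classified = variants.withColumn(
    "variant_type",
    F.when((F.length("reference") == 1) & (F.length("alternate") == 1), "SNP")
     .otherwise("indel"))

summary = (classified.groupBy("chromosome", "variant_type")
           .count()
           .orderBy("chromosome"))
summary.show()

spark.stop()
```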
Procedia PDF Downloads 259
24788 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons
Authors: Said Boularouk, Didier Josselin, Eitan Altman
Abstract:
In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. Indeed, the platform, based on produsage, gives data producers the freedom to choose the descriptors of geocoded locations. Unfortunately, this freedom, also called folksonomy, complicates subsequent searches of the data. We try to solve this issue with a simple but usable method to extract data from OSM databases in order to deliver them to visually impaired people using text-to-speech technology. We focus on how to help people suffering from visual disability to plan their itinerary, to comprehend a map by querying a computer, and to get information about the surrounding environment in a mono-modal human-computer dialogue.
Keywords: TTS, ontology, open street map, visually impaired
Procedia PDF Downloads 297
24787 Design and Development of a Platform for Analyzing Spatio-Temporal Data from Wireless Sensor Networks
Authors: Walid Fantazi
Abstract:
The development of sensor technology (such as microelectromechanical systems (MEMS), wireless communications, embedded systems, distributed processing and wireless sensor applications) has contributed to a broad range of WSN applications which are capable of collecting a large amount of spatiotemporal data in real time. These systems require real-time data processing to manage storage and query the data they collect. In order to cover these needs, we propose in this paper a snapshot spatiotemporal data model based on object-oriented concepts. This model allows efficient storage and reduces data redundancy, which makes it easier to execute spatiotemporal queries and saves analysis time. Further, to ensure the robustness of the system as well as to eliminate congestion in main memory, we propose an in-RAM spatiotemporal indexing technique called Captree*. As a result, we offer an RIA (Rich Internet Application)-based SOA application architecture which allows remote monitoring and control.
Keywords: WSN, indexing data, SOA, RIA, geographic information system
Procedia PDF Downloads 256