Search results for: real-time spatial big data
24894 Enhancing Healthcare Data Protection and Security
Authors: Joseph Udofia, Isaac Olufadewa
Abstract:
Every day, the volume of Electronic Health Records data increases as new patients visit health practitioners and returning patients fulfil their appointments. As these data grow, so does their susceptibility to cyber-attacks from criminals waiting to exploit them. In the US, the damages from cyberattacks were estimated at $8 billion (2018), $11.5 billion (2019) and $20 billion (2021). These attacks usually involve the exposure of personally identifiable information (PII). Health data is considered PII, and its exposure carries significant impact. To this end, an enhancement of health policy and standards in relation to data security, especially among patients and their clinical providers, is critical to ensure ethical practices, confidentiality, and trust in the healthcare system. As clinical accelerators and applications that contain user data are used, it is expedient to review and revamp policies like the Payment Card Industry Data Security Standard (PCI DSS), the Health Insurance Portability and Accountability Act (HIPAA), and the Fast Healthcare Interoperability Resources (FHIR), all aimed at ensuring data protection and security in healthcare. FHIR caters to healthcare data interoperability, as data is shared across different systems from customers to health insurance and care providers. The astronomical cost of implementation has deterred players in the space from ensuring compliance, leading to susceptibility to data exfiltration and data loss. Though HIPAA homes in on the security accuracy of protected health information (PHI) and PCI DSS on the security of payment card data, they intersect with the shared goal of protecting sensitive information in line with industry standards.
With advancements in technology and the emergence of new technologies, it is necessary to revamp these policies to address their complexity and ambiguity, cost barriers, and ever-increasing threats in cyberspace. Healthcare data in the wrong hands is a recipe for disaster, and we must enhance its protection and security to protect the mental health of current and future generations.
Keywords: cloud security, healthcare, cybersecurity, policy and standard
Procedia PDF Downloads 96
24893 Channels Splitting Strategy for Optical Local Area Networks of Passive Star Topology
Authors: Peristera Baziana
Abstract:
In this paper, we present a network configuration for WDM LANs of passive star topology that assumes that the set of data WDM channels is split into two separate sets of channels, with different access rights over them. Specifically, a synchronous transmission WDMA access algorithm is adopted in order to increase the probability of successful transmission over the data channels and consequently to reduce the probability that data packet transmissions are cancelled to avoid collisions on the data channels. Thus, a control pre-transmission access scheme is followed over a separate control channel. An analytical Markovian model is studied and the average throughput is mathematically derived. The performance is studied for several numbers of data channels and various values of control phase duration.
Keywords: access algorithm, channels division, collisions avoidance, wavelength division multiplexing
Procedia PDF Downloads 301
24892 Kinematical Analysis of Normal Children in Different Age Groups during Gait
Authors: Nawaf Al Khashram, Graham Arnold, Weijie Wang
Abstract:
Background: Gait classification allows clinicians to differentiate gait patterns into clinically important categories that help in clinical decision making. Reliable comparison of gait data between normal children and patients requires knowledge of the gait parameters of normal children in the specific age group. However, there is still a lack of gait databases for normal children of different ages. Objectives: The aim of this study is to investigate the kinematics of the lower limb joints during gait for normal children in different age groups. Methods: Fifty-three normal children (34 boys, 19 girls) were recruited in this study. All the children were aged between 5 and 16 years. Three age groups were defined: young child (5-7), child (8-11), and adolescent (12-16). When a participant agreed to take part in the project, their parents signed a consent form. A Vicon® motion capture system was used to collect gait data. Participants were asked to walk at their comfortable speed along a 10-meter walkway. Each participant walked up to 20 trials. Three good trials were analyzed using the Vicon Plug-in-Gait model to obtain gait parameters, e.g., walking speed, cadence, and stride length, and joint parameters, e.g., joint angle, force, and moments. Moreover, each gait cycle was divided into 8 phases. The range of motion (ROM) angles of the pelvis, hip, knee, and ankle joints in three planes of both limbs were calculated using an in-house program. Results: The temporal-spatial variables of the three age groups of normal children were compared with each other; it was found that there was a significant difference (p < 0.05) between the groups. The step length and walking speed gradually increased from the young child group to the adolescent group, while cadence gradually decreased. The mean and standard deviation (SD) of the step length of the young child, child and adolescent groups were 0.502 ± 0.067 m, 0.566 ± 0.061 m and 0.672 ± 0.053 m, respectively.
The mean and SD of the cadence of the young child, child and adolescent groups were 140.11 ± 15.79 step/min, 129 ± 11.84 step/min, and 115.96 ± 6.47 step/min, respectively. Moreover, it was observed that there were significant differences in kinematic parameters, in either the whole gait cycle or each phase. For example, the ROM of the knee angle in the sagittal plane over the whole cycle is larger in the young child group (65.03 ± 0.52 deg) than in the child group (63.47 ± 0.47 deg). Conclusion: Our results showed that there are significant differences between the age groups in the gait phases and thus that children's walking performance changes with age. Therefore, it is important for the clinician to consider age group when analyzing patients with lower limb disorders before any clinical treatment.
Keywords: age group, gait analysis, kinematics, normal children
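The opposing trends in step length and cadence can be checked against the reported increase in walking speed with a rough calculation. The sketch below assumes that walking speed is approximately step length multiplied by steps per second, and it multiplies the reported group means, which is only an approximation of each group's mean speed. The abstract does not report the speed values themselves, so the function and variable names are illustrative.

```python
# Reported group means from the abstract: (step length in m, cadence in steps/min).
GROUP_MEANS = {
    "young child": (0.502, 140.11),
    "child": (0.566, 129.0),
    "adolescent": (0.672, 115.96),
}


def approx_walking_speed(step_length_m, cadence_steps_per_min):
    """Approximate walking speed in m/s as step length times steps per second."""
    return step_length_m * cadence_steps_per_min / 60.0


# Assumption: the product of group means stands in for the group's mean speed.
speeds = {
    group: approx_walking_speed(step_length, cadence)
    for group, (step_length, cadence) in GROUP_MEANS.items()
}

for group, speed in speeds.items():
    print(f"{group}: about {speed:.2f} m/s")
```

Under this approximation the longer steps of the older groups outweigh their lower cadence, which is consistent with the increase in walking speed stated in the results.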
Procedia PDF Downloads 124
24891 Analyzing Tools and Techniques for Classification In Educational Data Mining: A Survey
Authors: D. I. George Amalarethinam, A. Emima
Abstract:
Educational Data Mining (EDM) is one of the newest topics to emerge in recent years, and it is concerned with developing methods for analyzing various types of data gathered from educational settings. EDM methods and techniques, together with machine learning algorithms, are used to extract meaningful and usable information from huge databases. For scientists and researchers, realistic applications of machine learning in EDM offer new frontiers and present new problems. One of the most important research areas in EDM is predicting student success. Prediction algorithms and techniques must be developed to forecast students’ performance, which helps tutors and institutions raise the level of that performance. This paper examines various classification techniques in prediction methods and the data mining tools used in EDM.
Keywords: classification technique, data mining, EDM methods, prediction methods
Procedia PDF Downloads 121
24890 Improving Security in Healthcare Applications Using Federated Learning System With Blockchain Technology
Authors: Aofan Liu, Qianqian Tan, Burra Venkata Durga Kumar
Abstract:
Data security is of the utmost importance in healthcare, as sensitive patient information is constantly transmitted and analyzed by many different parties. Federated learning, which enables data to be evaluated locally on devices rather than being transferred to a central server, has emerged as a potential solution for protecting the privacy of user information. However, federated learning alone might not be adequate to protect against data breaches and unauthorized access. In this context, the application of blockchain technology could give the system extra protection. This study proposes a distributed federated learning system built on blockchain technology in order to enhance security in healthcare. This makes it possible for a wide variety of healthcare providers to work together on data analysis without raising concerns about the confidentiality of the data. The technical aspects of the system, including the design and implementation of distributed learning algorithms, consensus mechanisms, and smart contracts, are also investigated. The proposed technique is a workable alternative that addresses concerns about the security of healthcare data while also fostering collaborative research and the interchange of data.
Keywords: data privacy, distributed system, federated learning, machine learning
Procedia PDF Downloads 139
24889 A Concept of Data Mining with XML Document
Authors: Akshay Agrawal, Anand K. Srivastava
Abstract:
The increasing amount of XML datasets available to casual users increases the need to investigate techniques for extracting knowledge from these data. Data mining is widely applied in the database research area in order to extract frequent correlations of values from both structured and semi-structured datasets. The increasing availability of heterogeneous XML sources has raised a number of issues concerning how to represent and manage these semi-structured data. In recent years, due to the importance of managing these resources and extracting knowledge from them, many methods have been proposed to represent and cluster them in different ways.
Keywords: XML, similarity measure, clustering, cluster quality, semantic clustering
Procedia PDF Downloads 388
24888 Speed-Up Data Transmission by Using Bluetooth Module on Gas Sensor Node of Arduino Board
Authors: Hiesik Kim, YongBeum Kim
Abstract:
Internet of Things (IoT) applications are widely deployed and spreading worldwide. Local wireless data transmission techniques must be developed to increase transmission speed. Bluetooth is a wireless data communication technique developed by the Special Interest Group (SIG) that uses the 2.4 GHz frequency range and exploits frequency hopping to avoid collisions with other devices. For the experiment, equipment for transmitting measured data was built using an Arduino as open-source hardware, a gas sensor, and a Bluetooth module, and an algorithm for controlling transmission speed is demonstrated. The experiment on controlling transmission speed was also carried out by developing an Android application that receives the measured data, and the experimental results show that this speed can be controlled. In the future, the communication algorithm needs to be improved because a few errors occur when data is transmitted or received.
Keywords: Arduino, Bluetooth, gas sensor, internet of things, transmission speed
Procedia PDF Downloads 485
24887 Evaluating the Total Costs of a Ransomware-Resilient Architecture for Healthcare Systems
Authors: Sreejith Gopinath, Aspen Olmsted
Abstract:
This paper is based on our previous work that proposed a risk-transference-based architecture for healthcare systems to store sensitive data outside the system boundary, rendering the system unattractive to would-be bad actors. This architecture also allows a compromised system to be abandoned and a new system instance spun up in place to ensure business continuity without paying a ransom or engaging with a bad actor. This paper delves into the details of various attacks we simulated against the prototype system. In the paper, we discuss at length the time and computational costs associated with storing and retrieving data in the prototype system, abandoning a compromised system, and setting up a new instance with existing data. Lastly, we simulate some analytical workloads over the data stored in our specialized data storage system and discuss the time and computational costs associated with running analytics over data in a specialized storage system outside the system boundary. In summary, this paper discusses the total costs of data storage, access, and analytics incurred with the proposed architecture.
Keywords: cybersecurity, healthcare, ransomware, resilience, risk transference
Procedia PDF Downloads 138
24886 A DEA Model in a Multi-Objective Optimization with Fuzzy Environment
Authors: Michael Gidey Gebru
Abstract:
Most Data Envelopment Analysis (DEA) models operate in a static environment with input and output parameters that are given by deterministic data. However, due to ambiguity brought on by shifting market conditions, input and output data are not always precisely gathered in real-world scenarios. Fuzzy numbers can be used to address this kind of ambiguity in input and output data. Therefore, this work aims to extend crisp DEA to DEA in a fuzzy environment. In this study, the input and output data are regarded as triangular fuzzy numbers. Then, the DEA model in a fuzzy environment is solved using a multi-objective method to gauge the efficiency of the Decision Making Units. Finally, the developed DEA model is illustrated with an application to real data from 50 educational institutions.
Keywords: efficiency, DEA, fuzzy, decision making units, higher education institutions
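To make the notion of a triangular fuzzy number concrete, the following minimal sketch represents one as a (lower, mode, upper) triple and computes its membership function and alpha-cut interval. The abstract does not say how the fuzzy DEA model is actually solved, so this illustrates only the data representation. The class name, method names, and example values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """A fuzzy quantity that is most plausibly `mode` and lies in [lower, upper]."""
    lower: float
    mode: float
    upper: float

    def membership(self, x):
        """Degree (0..1) to which x belongs to the fuzzy number."""
        if x == self.mode:
            return 1.0
        if x <= self.lower or x >= self.upper:
            return 0.0
        if x < self.mode:
            return (x - self.lower) / (self.mode - self.lower)
        return (self.upper - x) / (self.upper - self.mode)

    def alpha_cut(self, alpha):
        """Interval of values whose membership is at least alpha."""
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        left = self.lower + alpha * (self.mode - self.lower)
        right = self.upper - alpha * (self.upper - self.mode)
        return left, right


# Hypothetical input for one decision making unit: "about 100 staff".
staff = TriangularFuzzyNumber(lower=90.0, mode=100.0, upper=120.0)
interval = staff.alpha_cut(0.5)
print(interval)
```

An alpha-cut turns each fuzzy input or output into an ordinary interval, which is one common way to reduce a fuzzy model to a family of crisp ones.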
Procedia PDF Downloads 58
24885 Land Cover, Land Surface Temperature, and Urban Heat Island Effects in Tropical Sub Saharan City of Accra
Authors: Eric Mensah
Abstract:
The effects of rapid urbanisation of tropical sub-Saharan developing cities on local and global climate are of great concern due to the negative impacts of Urban Heat Island (UHI) effects. The importance of urban parks, vegetative cover and forest reserves in these tropical cities has been undervalued, with a rapid degradation and loss of these vegetative covers to urban developments, which continue to cause an increase in daily mean temperatures and changes to local climatic conditions. Using Landsat data of the same months and period intervals, the spatial variations of land cover changes, temperature, and vegetation were examined to determine how vegetation improves local temperature and the effects of urbanisation on daily mean temperatures over the past 12 years. The remote sensing techniques of maximum likelihood supervised classification, land surface temperature retrieval, and the normalised difference vegetation index were used to analyse and create the land use land cover (LULC), land surface temperature (LST), and vegetation and non-vegetation cover maps, respectively. Results from the study showed an increase in daily mean temperature of 0.80 °C as a result of a rapid increase in urban area of 46.13 sq. km and a loss of vegetative cover of 46.24 sq. km between 2005 and 2017. The LST map also shows the existence of UHI within the urban areas of Accra and the potential mitigating effects offered by forest and vegetative cover, as demonstrated by the existence of cool islands around the Achimota ecological forest and University of Ghana botanical gardens areas.
Keywords: land surface temperature, climate, remote sensing, urbanisation
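As an illustration of the vegetation index mentioned above, the sketch below computes the normalised difference vegetation index, (NIR - Red) / (NIR + Red), per pixel and thresholds it into a vegetation / non-vegetation mask. The abstract does not give the threshold or the band values used in the study. The 0.2 cut-off, the tiny 2 x 2 reflectance grids, and the function names are therefore assumptions for illustration only.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index for a single pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here for pixels with no signal
    return (nir - red) / total


def vegetation_mask(nir_band, red_band, threshold=0.2):
    """True where NDVI exceeds the (assumed) vegetation threshold."""
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical surface reflectance values for a 2 x 2 scene.
nir_band = [[0.50, 0.30], [0.20, 0.25]]
red_band = [[0.10, 0.25], [0.20, 0.05]]

mask = vegetation_mask(nir_band, red_band)
print(mask)
```

Comparing such masks between two acquisition dates gives a simple estimate of how much vegetated area was lost or gained.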
Procedia PDF Downloads 323
24884 Data-Driven Decision Making: Justification of Not Leaving Class without It
Authors: Denise Hexom, Judith Menoher
Abstract:
Teachers and administrators across America are being asked to use data and hard evidence to inform practice as they begin the task of implementing Common Core State Standards. Yet, the courses they are taking in schools of education are not preparing teachers or principals to understand the data-driven decision making (DDDM) process or to utilize data in a more sophisticated fashion. DDDM has been around for quite some time; however, it has only recently become systematically and consistently applied in the field of education. This paper discusses the theoretical framework of DDDM; empirical evidence supporting the effectiveness of DDDM; a process a department in a school of education has utilized to implement DDDM; and recommendations to other schools of education that attempt to implement DDDM in their decision-making processes and in their students’ coursework.
Keywords: data-driven decision making, institute of higher education, special education, continuous improvement
Procedia PDF Downloads 390
24883 Quantile Coherence Analysis: Application to Precipitation Data
Authors: Yaeji Lim, Hee-Seok Oh
Abstract:
Coherence analysis measures the linear time-invariant relationship between two data sets and has been studied in various fields such as signal processing, engineering, and medical science. However, classical coherence analysis tends to be sensitive to outliers and focuses only on the mean relationship. In this paper, we generalize the cross periodogram to the quantile cross periodogram, which provides a richer picture of the inter-relationship between two data sets. This is a general version of the Laplace cross periodogram. We prove its asymptotic distribution under the long-range process and compare it with ordinary coherence through numerical examples. We also present a real data example to confirm the usefulness of quantile coherence analysis.
Keywords: coherence, cross periodogram, spectrum, quantile
Procedia PDF Downloads 395
24882 Digital Metroliteracies: Space, Diversity and Identity
Authors: Sender Dovchin, Alastair Pennycook
Abstract:
This paper looks at the relationship between online space, urban space and digital literacies. The everyday digital literacy practices of Facebook users (with a particular focus on young urban Mongolians) can be understood as ‘metrolingual’ because of the varied ways in which linguistic and cultural resources, spatial repertoires, and online activities are bound together to make meaning. Whereas the initial development of the term metrolingualism was dependent on a notion of physical urban space, we here argue that the digital practices of these Facebook users perform a range of social and cultural identities (sexual, ethnic, and class-based identities) that are both part of and adjacent to the metrolingual fabric.
Keywords: metrolingualism, digital literacy, Mongolia, Facebook
Procedia PDF Downloads 230
24881 Conception of a Predictive Maintenance System for Forest Harvesters from Multiple Data Sources
Authors: Lazlo Fauth, Andreas Ligocki
Abstract:
For cost-effective use of harvesters, expensive repairs and unplanned downtimes must be reduced as far as possible. The predictive detection of failing systems and the calculation of intelligent service intervals, necessary to avoid these factors, require in-depth knowledge of the machines' behavior. Such know-how needs permanent monitoring of the machine state from different technical perspectives. In this paper, three approaches are presented as they are currently pursued in the publicly funded project PreForst at Ostfalia University of Applied Sciences. These include the intelligent linking of workshop and service data, sensors on the harvester, and a special online hydraulic oil condition monitoring system. Furthermore, the paper shows the potential of, as well as the challenges for, the use of these data in the conception of a predictive maintenance system.
Keywords: predictive maintenance, condition monitoring, forest harvesting, forest engineering, oil data, hydraulic data
Procedia PDF Downloads 153
24880 Sampled-Data Control for Fuel Cell Systems
Authors: H. Y. Jung, Ju H. Park, S. M. Lee
Abstract:
A sampled-data controller is presented for solid oxide fuel cell systems, which are expressed by a sector-bounded nonlinear model. Sector-bounded nonlinear systems consist of a feedback connection between a linear dynamical system and a nonlinearity satisfying certain sector-type constraints. The sampled-data control scheme is very useful because it can be implemented with a digital controller, and increasing research effort has been devoted to sampled-data control systems with the development of modern high-speed computers. The proposed control law is obtained by solving a convex problem satisfying several linear matrix inequalities. Simulation results are given to show the effectiveness of the proposed design method.
Keywords: sampled-data control, fuel cell, linear matrix inequalities, nonlinear control
Procedia PDF Downloads 568
24879 How Western Donors Allocate Official Development Assistance: New Evidence From a Natural Language Processing Approach
Authors: Daniel Benson, Yundan Gong, Hannah Kirk
Abstract:
Advances in natural language processing techniques have led to increased data processing speeds and reduced the need for the cumbersome, manual data processing that is often required when processing data from multilateral organizations for specific purposes. As such, using named entity recognition (NER) modeling and the Organisation for Economic Co-operation and Development (OECD) Creditor Reporting System database, we present the first geotagged dataset of OECD donor Official Development Assistance (ODA) projects on a global, subnational basis. Our resulting data contains 52,086 ODA projects geocoded to subnational locations across 115 countries, worth a combined $87.9bn. This represents the first global, OECD donor ODA project database with geocoded projects. We use this new data to revisit old questions of how ‘well’ donors allocate ODA to the developing world. This understanding is imperative for policymakers seeking to improve ODA effectiveness.
Keywords: international aid, geocoding, subnational data, natural language processing, machine learning
Procedia PDF Downloads 85
24878 Modeling Soil Erosion and Sediment Yield in Geba Catchment, Ethiopia
Authors: Gebremedhin Kiros, Amba Shetty, Lakshman Nandagiri
Abstract:
Soil erosion is a major threat to the sustainability of land and water resources in the catchment, and there is a need to identify critical areas of erosion so that suitable conservation measures may be adopted. The present study was taken up to understand the temporal and spatial distribution of soil erosion and daily sediment yield in the Geba catchment (5137 km²), located in the Northern Highlands of Ethiopia. The Soil and Water Assessment Tool (SWAT) was applied to the Geba catchment using data pertaining to rainfall, climate, soils, topography and land use/land cover (LU/LC) for the historical period 2000-2013. LU/LC distribution in the catchment was characterized using LANDSAT satellite imagery and the GIS-based ArcSWAT version of the model. The model was calibrated and validated using sediment concentration measurements made at the catchment outlet. The catchment was divided into 13 sub-basins, which were prioritized on the basis of their estimated susceptibility to soil erosion. Model results indicated that the estimated average sediment yield of the catchment was 12.23 tons/ha/yr. The generated soil loss map indicated that a large portion of the catchment has high erosion rates, resulting in significantly large sediment yield at the outlet. Steep and unstable terrain, the occurrence of highly erodible soils and low vegetation cover appeared to favor high soil erosion. Results obtained from this study should prove useful in adopting targeted soil and water conservation measures and in promoting sustainable management of natural resources in the Geba and similar catchments in the region.
Keywords: Ethiopia, Geba catchment, MUSLE, sediment yield, SWAT Model
Procedia PDF Downloads 318
24877 Compressed Suffix Arrays to Self-Indexes Based on Partitioned Elias-Fano
Abstract:
A practical and simple self-indexing data structure, the Partitioned Elias-Fano (PEF) Compressed Suffix Array (CSA), is built in linear time for the CSA based on PEF indexes. Moreover, the PEF-CSA is compared with two classical compressed indexing methods, the Ferragina and Manzini implementation (FMI) and Sad-CSA, on files of different types and sizes from Pizza & Chili. The PEF-CSA performs better on the existing data in terms of compression ratio, count time, and locate time, except for evenly distributed data such as protein data. The experiments show that the distribution of φ matters more to the compression ratio than the alphabet size does. An unevenly distributed φ yields better compression, and the larger the hit count, the longer the count and locate times.
Keywords: compressed suffix array, self-indexing, partitioned Elias-Fano, PEF-CSA
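For readers unfamiliar with the underlying encoding, the sketch below shows plain (unpartitioned) Elias-Fano coding of a non-decreasing integer sequence. Each value is split into a few low bits stored verbatim and a high part stored in a unary-style bit vector. This is not the PEF-CSA of the abstract: the partitioned variant additionally splits the sequence into chunks that are encoded separately, and that step is omitted here. The function names and the sample sequence are illustrative assumptions.

```python
def elias_fano_encode(values, universe):
    """Encode a non-decreasing list of integers drawn from [0, universe)."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    if any(b < a for a, b in zip(values, values[1:])):
        raise ValueError("values must be non-decreasing")
    if values[0] < 0 or values[-1] >= universe:
        raise ValueError("values must lie in [0, universe)")

    # Number of low bits per element: floor(log2(universe / n)), at least 0.
    low_bits = max(0, (universe // n).bit_length() - 1)
    mask = (1 << low_bits) - 1
    lows = [v & mask for v in values]

    # High parts: element i sets the bit at position (high part + i).
    highs = [0] * (n + (universe >> low_bits) + 1)
    for i, v in enumerate(values):
        highs[(v >> low_bits) + i] = 1
    return low_bits, lows, highs


def elias_fano_decode(low_bits, lows, highs):
    """Recover the original sequence from its Elias-Fano parts."""
    values = []
    i = 0
    for pos, bit in enumerate(highs):
        if bit:
            values.append(((pos - i) << low_bits) | lows[i])
            i += 1
    return values


sequence = [3, 4, 7, 13, 14, 15, 21, 43]
low_bits, lows, highs = elias_fano_encode(sequence, universe=44)
decoded = elias_fano_decode(low_bits, lows, highs)
print(low_bits, lows, highs)
```

The encoding spends roughly two bits plus the low bits per element, which is why sequences with clustered, unevenly spaced values benefit from being partitioned and encoded chunk by chunk.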
Procedia PDF Downloads 254
24876 Data, Digital Identity and Antitrust Law: An Exploratory Study of Facebook’s Novi Digital Wallet
Authors: Wanjiku Karanja
Abstract:
Facebook has monopoly power in the social networking market. It has grown and entrenched its monopoly power through the capture of its users’ data value chains. However, antitrust law’s consumer welfare roots have prevented it from effectively addressing the role of data capture in Facebook’s market dominance. These regulatory blind spots are augmented in Facebook’s proposed Diem cryptocurrency project and its Novi Digital wallet. Novi, which is Diem’s digital identity component, shall enable Facebook to collect an unprecedented volume of consumer data. Consequently, Novi has seismic implications for internet identity, as the network effects of Facebook’s large user base could establish it as the de facto internet identity layer. Moreover, the large tracts of data Facebook shall collect through Novi shall further entrench Facebook's market power. As such, the attendant lock-in effects of this project shall be very difficult to reverse. Urgent regulatory action is therefore required to prevent this expansion of Facebook’s data resources and monopoly power. This research thus highlights the importance of data capture to competition and market health in the social networking industry. It utilizes interviews with key experts to empirically interrogate the impact of Facebook’s data capture and control of its users’ data value chains on its market power. This inquiry is contextualized against Novi’s expansive effect on Facebook’s data value chains. It thus addresses the novel antitrust issues arising at the nexus of Facebook’s monopoly power and the privacy of its users’ data. It also explores the impact of platform design principles, specifically data portability and data interoperability, in mitigating Facebook’s anti-competitive practices. As such, this study finds that Facebook is a powerful monopoly that dominates the social media industry to the detriment of potential competitors.
Facebook derives its power from its size, annexure of the consumer data value chain, and control of its users’ social graphs. Additionally, the platform design principles of data interoperability and data portability are not a panacea for restoring competition in the social networking market. Their success depends on the establishment of robust technical standards and regulatory frameworks.
Keywords: antitrust law, data protection law, data portability, data interoperability, digital identity, Facebook
Procedia PDF Downloads 125
24875 Unlocking New Room of Production in Brown Field; Integration of Geological Data Conditioned 3D Reservoir Modelling of Lower Senonian Matulla Formation, RAS Budran Field, East Central Gulf of Suez, Egypt
Authors: Nader Mohamed
Abstract:
The Late Cretaceous deposits are well developed throughout Egypt. This is due to a transgression phase associated with the subsidence caused by the neo-Tethyan rift event that took place across the northern margin of Africa, resulting in a period of dominantly marine deposits in the Gulf of Suez. The Late Cretaceous Nezzazat Group represents the Cenomanian, Turonian and clastic sediments of the Lower Senonian. The Nezzazat Group has been divided into four formations, namely, from base to top, the Raha Formation, the Abu Qada Formation, the Wata Formation and the Matulla Formation. The Cenomanian Raha and the Lower Senonian Matulla formations are the most important clastic sequences in the Nezzazat Group because they provide the highest net reservoir thickness and the highest net/gross ratio. This study focuses on the Matulla Formation in the eastern part of the Gulf of Suez. The three stratigraphic surface sections (Wadi Sudr, Wadi Matulla and Gabal Nezzazat) that represent the exposed Coniacian-Santonian sediments in Sinai are used for correlating the Matulla sediments of Ras Budran field. Cutting descriptions, petrographic examination, log behavior, and biostratigraphy, together with outcrops, are used to identify the reservoir characteristics, lithology, and facies environment and to subdivide the Matulla Formation into three units. The lower unit is believed to be the main reservoir, as it consists mainly of sands with shale and sandy carbonates, while the other units are mainly carbonate with some streaks of shale and sand. Reservoir modeling is an effective technique that assists reservoir management in decisions concerning the development and depletion of hydrocarbon reserves, so it was essential to model the Matulla reservoir as accurately as possible in order to better evaluate it, calculate the reserves, and determine the most effective way of recovering as much of the petroleum as is economically possible.
All available data on the Matulla Formation are used to build the reservoir structure model and the lithofacies, porosity, permeability and water saturation models, which are the main parameters that describe the reservoir and inform an effective evaluation of the need to develop its oil potential. This study has shown the effectiveness of: 1) the integration of geological data to evaluate and subdivide the Matulla Formation into three units; 2) lithology and facies environment interpretation, which helped in defining the nature of deposition of the Matulla Formation; and 3) 3D reservoir modeling technology as a tool for adequately understanding the spatial distribution of properties and for evaluating the newly unlocked reservoir areas of the Matulla Formation, which have to be drilled to investigate and exploit the undrained oil. In addition, this study led to adding a new room of production and additional reserves to Ras Budran field.
Keywords: geology, oil and gas, geoscience, sequence stratigraphy
Procedia PDF Downloads 108
24874 Recommendations for Data Quality Filtering of Opportunistic Species Occurrence Data
Authors: Camille Van Eupen, Dirk Maes, Marc Herremans, Kristijn R. R. Swinnen, Ben Somers, Stijn Luca
Abstract:
In ecology, species distribution models are commonly implemented to study species-environment relationships. These models increasingly rely on opportunistic citizen science data when high-quality species records collected through standardized recording protocols are unavailable. While these opportunistic data are abundant, uncertainty is usually high, e.g., due to observer effects or a lack of metadata. Data quality filtering is often used to reduce these types of uncertainty in an attempt to increase the value of studies relying on opportunistic data. However, filtering should not be performed blindly. In this study, recommendations are built for data quality filtering of opportunistic species occurrence data that are used as input for species distribution models. Using an extensive database of 5.7 million citizen science records from 255 species in Flanders, the impact on model performance was quantified by applying three data quality filters, and these results were linked to species traits. More specifically, presence records were filtered based on record attributes that provide information on the observation process or post-entry data validation, and changes in the area under the receiver operating characteristic (AUC), sensitivity, and specificity were analyzed using the Maxent algorithm with and without filtering. Controlling for sample size enabled us to study the combined impact of data quality filtering, i.e., the simultaneous impact of an increase in data quality and a decrease in sample size. Further, the variation among species in their response to data quality filtering was explored by clustering species based on four traits often related to data quality: commonness, popularity, difficulty, and body size. 
Findings show that model performance is affected by i) the quality of the filtered data, ii) the proportional reduction in sample size caused by filtering and the remaining absolute sample size, and iii) a species' ‘quality profile’, resulting from a species classification based on the four traits related to data quality. The findings resulted in recommendations on when and how to filter volunteer-generated and opportunistically collected data. This study confirms that correctly processed citizen science data can make a valuable contribution to ecological research and species conservation.
Keywords: citizen science, data quality filtering, species distribution models, trait profiles
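The three performance measures named in the abstract above (AUC, sensitivity, and specificity) can be illustrated with a minimal sketch. The labels, scores, and the 0.5 threshold below are hypothetical values chosen for illustration; they are not data from the study, and the sketch does not reproduce the Maxent workflow itself.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Fraction of (presence, absence) pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """True positive rate and true negative rate at a given score threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 1 = presence, 0 = absence/background.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

model_auc = auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
```

Comparing these values for a model fitted with and without a given filter is one simple way to express the change in performance that the abstract refers to.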
Procedia PDF Downloads 209
24873 Data Quality Enhancement with String Length Distribution
Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda
Abstract:
Recently, the volume of manufacturing data that can be collected has been increasing rapidly. At the same time, mega recalls are becoming a serious social problem. Under such circumstances, there is a growing need to prevent mega recalls through defect analysis, such as root cause analysis and anomaly detection, using manufacturing data. However, classifying strings in manufacturing data with traditional methods takes too long to meet the requirements of quick defect analysis. Therefore, we present the String Length Distribution Classification (SLDC) method to classify strings correctly in a short time. This method learns character features, especially the string length distribution, from Product IDs and Machine IDs in the BOM and the asset list. By applying the proposed method to strings in actual manufacturing data, we verified that string classification time can be reduced by 80%. As a result, we estimate that the requirement of quick defect analysis can be fulfilled.
Keywords: string classification, data quality, feature selection, probability distribution, string length
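The abstract above does not give the details of SLDC, so the following is only a rough sketch of the general idea of classifying strings by their length distribution. The function names, the add-one smoothing, the `max_len` parameter, and the sample IDs are all assumptions made for illustration.

```python
from collections import Counter


def fit_length_distributions(examples):
    """Learn a string-length histogram per class.

    examples: dict mapping a class name to a list of example strings.
    """
    return {label: Counter(len(s) for s in strings)
            for label, strings in examples.items()}


def classify(value, models, max_len=64):
    """Pick the class whose learned length distribution best explains `value`.

    Add-one smoothing over an assumed maximum of `max_len` distinct lengths
    keeps unseen lengths from getting zero probability.
    """
    def prob(label):
        counts = models[label]
        total = sum(counts.values())
        return (counts[len(value)] + 1) / (total + max_len)

    return max(models, key=prob)


# Hypothetical training strings, standing in for BOM and asset-list entries.
training = {
    "product_id": ["PRD-000123", "PRD-004567", "PRD-998877"],
    "machine_id": ["M-01", "M-02", "M-117"],
}
models = fit_length_distributions(training)

label_a = classify("PRD-555555", models)  # length 10, matches product IDs
label_b = classify("M-09", models)        # length 4, matches machine IDs
```

Because only the length of each string is inspected, classification cost is independent of string content, which is one plausible reason a length-based feature can be fast.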
Procedia PDF Downloads 321
24872 Determining Abnomal Behaviors in UAV Robots for Trajectory Control in Teleoperation
Authors: Kiwon Yeom
Abstract:
Change points are abrupt variations in a data sequence. Detection of change points is useful in modeling, analyzing, and predicting time series in application areas such as robotics and teleoperation. In this paper, a change point is defined to be a discontinuity in one of the derivatives of the data sequence. This paper presents a reliable method for detecting discontinuities within three-dimensional trajectory data. The problem of determining one or more discontinuities is considered in regular and irregular trajectory data from teleoperation. We examine the geometric detection algorithm and illustrate the use of the method on real data examples.
Keywords: change point, discontinuity, teleoperation, abrupt variation
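The definition of a change point as a discontinuity in a derivative can be illustrated with a simple finite-difference sketch. This is not the geometric detection algorithm examined in the paper; it assumes uniformly sampled 3D points, and the `dt` and `threshold` parameters and the sample trajectory are hypothetical.

```python
import math


def detect_change_points(points, dt=1.0, threshold=1.0):
    """Flag samples where the first derivative (velocity) jumps abruptly.

    points: list of (x, y, z) tuples sampled at a uniform interval dt.
    Returns indices into `points` at which the velocity vector changes by
    more than `threshold` between consecutive segments.
    """
    # Finite-difference velocity for each consecutive pair of samples.
    velocities = [
        tuple((b - a) / dt for a, b in zip(p, q))
        for p, q in zip(points, points[1:])
    ]
    changes = []
    for i in range(1, len(velocities)):
        # velocities[i - 1] ends at points[i]; velocities[i] starts there.
        jump = math.dist(velocities[i], velocities[i - 1])
        if jump > threshold:
            changes.append(i)
    return changes


# Hypothetical trajectory: straight along x, then a sharp turn along y.
trajectory = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (3, 1, 0), (3, 2, 0)]
corners = detect_change_points(trajectory, dt=1.0, threshold=1.0)
```

For irregularly sampled data, the per-segment time step would have to replace the single `dt`, and noisy data would typically need smoothing before differencing.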
Procedia PDF Downloads 170
24871 Dynamic Programming Based Algorithm for the Unit Commitment of the Transmission-Constrained Multi-Site Combined Heat and Power System
Authors: A. Rong, P. B. Luh, R. Lahdelma
Abstract:
High penetration of intermittent renewable energy sources (RES) such as solar power and wind power into the energy system has caused temporal and spatial imbalance between electric power supply and demand in some countries and regions. This brings about a critical need to coordinate power production and power exchange across regions. Compared with power-only systems, combined heat and power (CHP) systems can provide additional flexibility in utilizing RES by exploiting the interdependence of power and heat production in the CHP plant. In a CHP system, power production can be influenced by adjusting the heat production level, and electric power can be used to satisfy heat demand through an electric boiler or heat pump in conjunction with heat storage, which is much cheaper than electric storage. This paper addresses multi-site CHP systems without considering RES, which lays the foundation for handling the penetration of RES. The problem under study is the unit commitment (UC) of transmission-constrained multi-site CHP systems. We solve the problem by combining linear relaxation of ON/OFF states with sequential dynamic programming (DP) techniques, where the relaxed states are used to reduce the dimension of the UC problem and DP is used to improve the solution quality. Numerical results for daily scheduling with realistic models and data show that the DP-based algorithm is a few to a few hundred times faster than CPLEX (standard commercial optimization software) with good solution accuracy (less than 1% relative gap from the optimal solution on average).
Keywords: dynamic programming, multi-site combined heat and power system, relaxed states, transmission-constrained generation unit commitment
Procedia PDF Downloads 369
24870 Multidimensional Item Response Theory Models for Practical Application in Large Tests Designed to Measure Multiple Constructs
Authors: Maria Fernanda Ordoñez Martinez, Alvaro Mauricio Montenegro
Abstract:
This work presents a statistical methodology for measuring and finding constructs in Latent Semantic Analysis. This approach uses the qualities of Factor Analysis in binary data with interpretations from Item Response Theory. More precisely, we propose first reducing dimensionality through a specific use of Principal Component Analysis on the linguistic data and then producing axes of groups derived from a clustering analysis of the semantic data. This approach allows the user to give meaning to the previous clusters and to find the real latent structure present in the data. The methodology is applied to a set of real semantic data, with impressive results in coherence, speed, and precision.
Keywords: semantic analysis, factorial analysis, dimension reduction, penalized logistic regression
Procedia PDF Downloads 449
24869 Analysis of Production Forecasting in Unconventional Gas Resources Development Using Machine Learning and Data-Driven Approach
Authors: Dongkwon Han, Sangho Kim, Sunil Kwon
Abstract:
Unconventional gas resources have dramatically changed the future energy landscape. Unlike conventional gas resources, unconventional gas requires advanced approaches to production forecasting because of the uncertainty and complexity of fluid flow. In this study, an artificial neural network (ANN) model that integrates machine learning and a data-driven approach was developed to predict productivity in shale gas. A database of 129 wells in the Eagle Ford shale basin was used for training and testing the ANN model. Input data related to hydraulic fracturing, well completion, and shale gas productivity were selected, and the output is cumulative production. The performance of the ANN using all data sets, clustering, and variable importance (VI) models was compared in terms of mean absolute percentage error (MAPE). The MAPE values were 44.22% for the ANN model using all data sets; 10.08% (cluster 1), 5.26% (cluster 2), and 6.35% (cluster 3) for clustering; and 32.23% (ANN VI) and 23.19% (SVM VI) for the VI models. The results showed that the pre-trained ANN model provides more accurate results than the ANN model using all data sets.
Keywords: unconventional gas, artificial neural network, machine learning, clustering, variables importance
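The models in the abstract above are compared by mean absolute percentage error (MAPE). As a brief illustration of how that metric is computed, here is a minimal sketch; the production figures are hypothetical and are not taken from the Eagle Ford data set.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is scaled by it.
    """
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical cumulative production values (arbitrary units).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

score = mape(actual, predicted)  # per-well errors: 10%, 10%, 0%
```

A lower MAPE means predictions sit closer to observed production in relative terms, which is why the per-cluster values in the abstract indicate better fits than the all-data model.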
Procedia PDF Downloads 199
24868 Procedure Model for Data-Driven Decision Support Regarding the Integration of Renewable Energies into Industrial Energy Management
Authors: M. Graus, K. Westhoff, X. Xu
Abstract:
Climate change is causing change in all aspects of society. While the expansion of renewable energies proceeds, general studies about the potential of demand side management have not convinced industry to reinforce smart grid considerations in its operational business. In this article, a procedure model for case-specific, data-driven decision support for industrial energy management, based on a holistic data analytics approach, is presented. The model is executed on the example of a strategic decision problem: integrating the aspect of renewable energies into industrial energy management. This question arises from considerations of changing the electricity contract model from a standard rate to volatile energy prices that follow the energy spot market, which is increasingly affected by renewable energies. The procedure model corresponds to a data analytics process consisting of a data model, analysis, simulation, and optimization step. This procedure will help to quantify the potential of sustainable production concepts based on data from a factory. The model is validated with data from a printer, which serves as an analogy to a simple production machine. The overall goal is to establish smart grid principles for industry via the transformation from knowledge-driven to data-driven decisions within manufacturing companies.
Keywords: data analytics, green production, industrial energy management, optimization, renewable energies, simulation
Procedia PDF Downloads 439
24867 Dissimilarity-Based Coloring for Symbolic and Multivariate Data Visualization
Authors: K. Umbleja, M. Ichino, H. Yaguchi
Abstract:
In this paper, we propose a coloring method for multivariate data visualization using parallel coordinates, based on dissimilarity and tree structure information gathered during hierarchical clustering. The proposed method is an extension of proximity-based coloring, which suffers from a few undesired side effects if the hierarchical tree structure is not a balanced tree. We describe the algorithm for assigning colors based on dissimilarity information, show the application of the proposed method to three commonly used datasets, and compare the results with proximity-based coloring. We found our proposed method to be especially beneficial for symbolic data visualization, where many individual objects have already been aggregated into a single symbolic object.
Keywords: data visualization, dissimilarity-based coloring, proximity-based coloring, symbolic data
Procedia PDF Downloads 173
24866 Geothermal Resources to Ensure Energy Security During Climate Change
Authors: Debasmita Misra, Arthur Nash
Abstract:
Energy security and sufficiency enable the economic development and welfare of a nation or a society. Currently, the global energy system is dominated by fossil fuels, a non-renewable energy resource, which leaves energy security vulnerable. Hence, many nations have begun augmenting their energy systems with renewable energy resources, such as solar, wind, biomass, and hydro. However, with climate change, how sustainable some of these renewable energy resources will be in the future is a matter of concern. Geothermal energy resources have been underexplored or underexploited in global renewable energy production and security, although they are gaining attractiveness as a renewable energy resource. The question is whether geothermal energy resources are more sustainable than other renewable energy resources. High-temperature reservoirs (> 220 °F) can produce electricity from flash/dry steam plants as well as binary cycle production facilities. Most of the world’s high-enthalpy geothermal resources are within the seismo-tectonic belt. However, exploration for geothermal energy is of great importance in conventional geothermal systems in order to improve their economic viability. In recent years, there has been an increase in the use and development of several exploration methods for geothermal resources, such as seismic or electromagnetic methods. The thermal infrared band of Landsat can reflect land surface temperature differences, so ETM+ data with specific grey stretch enhancement have been used to explore for underground hot water. Another way of exploring for potential power is to use fairway play analysis for sites without surface expression and in rift zones. This type of analysis can improve the success rate of project development by reducing exploration costs. 
Identifying the basin distribution of the geologic factors that control the geothermal environment would help in identifying the controls on resource concentration aside from heat flow, thus improving the probability of success. The first step is compiling existing geophysical data. This leads to constructing conceptual models of potential geothermal concentrations, which can then be used to create a geodatabase for analyzing risk maps. Geospatial analysis and other GIS tools can be used in such efforts to produce spatial distribution maps. The goal of this paper is to discuss how climate change may impact renewable energy resources and how a synthesized analysis could be developed for geothermal resources to ensure sustainable and cost-effective exploitation of the resource.
Keywords: exploration, geothermal, renewable energy, sustainable
Procedia PDF Downloads 156
24865 Flood Inundation Mapping at Wuseta River, East Gojjam Zone, Amhara Regional State, Ethiopia
Authors: Arega Mulu
Abstract:
Flooding is a common phenomenon that will continue to be a leading risk as long as societies live and work in flood-prone areas. It happens when the volume of rainwater in a stream exceeds the capacity of the channel. In Ethiopia, urban flood events have become a severe problem in recent years. This flooding is mainly related to poorly planned city drainage schemes and land use design. Combined with this, the absence of detailed flood levels, the absence of an early warning system, and the lack of organized flood disaster mitigation actions at national and local levels further increase the gravity of the problem. Hence, this study produces flood inundation maps for the Wuseta River using the HEC-GeoRAS and HEC-RAS models. The flooded areas along the Wuseta River have been plotted for different return periods. The peak flows for the various return periods were assessed using the HEC-RAS model, with GIS for spatial data processing and HEC-GeoRAS for interfacing between HEC-RAS and GIS. The areas along the Wuseta River were simulated to be flooded for 5-, 10-, 25-, 50-, and 100-year return periods. For the 100-year return period flood, the maximum flood depth was 2.26 m, and the maximum width was 0.3 km on each riverside. This maximum flood depth extended along the route from the university to Debre Markos Town. Most of the affected area lay between the Wuseta market and the Abaykunu new bridge, and a small portion lay between Abaykunu and the road crossing from Addis Ababa to Debre Markos Town. The outcome of this study will help the concerned bodies formulate and advance policies according to the existing flood risk in the area.
Keywords: flood inundation, Wuseta River, HEC-HMS, HEC-RAS
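The return periods used in the abstract above (5, 10, 25, 50, and 100 years) can be related to probabilities with a short worked sketch. It relies on the standard definition that a T-year flood has an annual exceedance probability of 1/T and assumes independent years; the 30-year planning horizon is a hypothetical value, not one from the study.

```python
def annual_exceedance_probability(return_period_years):
    """Probability that a flood of the given return period is exceeded in any one year."""
    return 1.0 / return_period_years


def risk_over_horizon(return_period_years, horizon_years):
    """Probability of at least one exceedance within the horizon.

    Assumes each year is independent with the same exceedance probability.
    """
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


# Return periods from the abstract; the 30-year horizon is an assumption.
return_periods = [5, 10, 25, 50, 100]
risk_table = {t: risk_over_horizon(t, horizon_years=30) for t in return_periods}
```

Under these assumptions, even the 100-year flood has roughly a one-in-four chance of occurring at least once over 30 years, which is why inundation maps for long return periods remain relevant to planning.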
Procedia PDF Downloads 13