Search results for: data mining analytics
23959 Web Search Engine Based Naming Procedure for Independent Topic
Authors: Takahiro Nishigaki, Takashi Onoda
Abstract:
In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.Keywords: independent topic analysis, topic extraction, topic naming, web search engine
Procedia PDF Downloads 11923958 Extracting Terrain Points from Airborne Laser Scanning Data in Densely Forested Areas
Authors: Ziad Abdeldayem, Jakub Markiewicz, Kunal Kansara, Laura Edwards
Abstract:
Airborne Laser Scanning (ALS) is one of the main technologies for generating high-resolution digital terrain models (DTMs). DTMs are crucial to several applications, such as topographic mapping, flood zone delineation, geographic information systems (GIS), hydrological modelling, spatial analysis, etc. Laser scanning system generates irregularly spaced three-dimensional cloud of points. Raw ALS data are mainly ground points (that represent the bare earth) and non-ground points (that represent buildings, trees, cars, etc.). Removing all the non-ground points from the raw data is referred to as filtering. Filtering heavily forested areas is considered a difficult and challenging task as the canopy stops laser pulses from reaching the terrain surface. This research presents an approach for removing non-ground points from raw ALS data in densely forested areas. Smoothing splines are exploited to interpolate and fit the noisy ALS data. The presented filter utilizes a weight function to allocate weights for each point of the data. Furthermore, unlike most of the methods, the presented filtering algorithm is designed to be automatic. Three different forested areas in the United Kingdom are used to assess the performance of the algorithm. The results show that the generated DTMs from the filtered data are accurate (when compared against reference terrain data) and the performance of the method is stable for all the heavily forested data samples. The average root mean square error (RMSE) value is 0.35 m.Keywords: airborne laser scanning, digital terrain models, filtering, forested areas
Procedia PDF Downloads 13923957 Estimating the Life-Distribution Parameters of Weibull-Life PV Systems Utilizing Non-Parametric Analysis
Authors: Saleem Z. Ramadan
Abstract:
In this paper, a model is proposed to determine the life distribution parameters of the useful life region for the PV system utilizing a combination of non-parametric and linear regression analysis for the failure data of these systems. Results showed that this method is dependable for analyzing failure time data for such reliable systems when the data is scarce.Keywords: masking, bathtub model, reliability, non-parametric analysis, useful life
Procedia PDF Downloads 56223956 Improvement of Overall Equipment Effectiveness of Load Haul Dump Machines in Underground Coal Mines
Authors: J. BalaRaju, M. Govinda Raj, C. S. N. Murthy
Abstract:
Every organization in the competitive world tends to improve its economy by increasing their production and productivity rates. Unequivocally, the production in Indian underground mines over the years is not satisfactory, due to a variety of reasons. There are manifold of avenues for the betterment of production, and one such approach is through enhanced utilization of mechanized equipment such as Load Haul Dumper (LHD). This is used as loading and hauling purpose in underground mines. In view of the aforementioned facts, this paper delves into identification of the key influencing factors such as LHDs maintenance effectiveness, vehicle condition, operator skill and utilization of the machines on performance of LHDs. An attempt has been made for improvement of performance of the equipment through evaluation of Overall Equipment Effectiveness (OEE). Two different approaches for evaluation of OEE have been adopted and compared under various operating conditions. The use of OEE calculation in terms of percentage availability, performance and quality and the hitherto existing situation of the underground mine production is evaluated. Necessary recommendations are suggested to mining industry on the basis of OEE.Keywords: utilization, maintenance, availability, performance and quality
Procedia PDF Downloads 22223955 Preliminary Design of Maritime Energy Management System: Naval Architectural Approach to Resolve Recent Limitations
Authors: Seyong Jeong, Jinmo Park, Jinhyoun Park, Boram Kim, Kyoungsoo Ahn
Abstract:
Energy management in the maritime industry is being required by economics and in conformity with new legislative actions taken by the International Maritime Organization (IMO) and the European Union (EU). In response, the various performance monitoring methodologies and data collection practices have been examined by different stakeholders. While many assorted advancements in operation and technology are applicable, their adoption in the shipping industry stays small. This slow uptake can be considered due to many different barriers such as data analysis problems, misreported data, and feedback problems, etc. This study presents a conceptual design of an energy management system (EMS) and proposes the methodology to resolve the limitations (e.g., data normalization using naval architectural evaluation, management of misrepresented data, and feedback from shore to ship through management of performance analysis history). We expect this system to make even short-term charterers assess the ship performance properly and implement sustainable fleet control.Keywords: data normalization, energy management system, naval architectural evaluation, ship performance analysis
Procedia PDF Downloads 44923954 Investigating of the Fuel Consumption in Construction Machinery and Ways to Reduce Fuel Consumption
Authors: Reza Bahboodian
Abstract:
One of the most important factors in the use of construction machinery is the fuel consumption cost of this equipment. The use of diesel engines in off-road vehicles is an important source of nitrogen oxides and particulate matter. Emissions of nitrogen oxides and particulate matter 10 in off-road vehicles (construction and mining) may be high. Due to the high cost of fuel, it is necessary to minimize fuel consumption. Factors affecting the fuel consumption of these cars are very diverse. Climate changes such as changes in pressure, temperature, humidity, fuel type selection, type of gearbox used in the car are effective in fuel consumption and pollution, and engine efficiency. In this paper, methods for reducing fuel consumption and pollutants by considering valid European and European standards are examined based on new methods such as hybridization, optimal gear change, adding hydrogen to diesel fuel, determining optimal working fluids, and using oxidation catalysts.Keywords: improve fuel consumption, construction machinery, pollutant reduction, determining the optimal working cycle
Procedia PDF Downloads 16123953 Geospatial Data Complexity in Electronic Airport Layout Plan
Authors: Shyam Parhi
Abstract:
Airports GIS program collects Airports data, validate and verify it, and stores it in specific database. Airports GIS allows authorized users to submit changes to airport data. The verified data is used to develop several engineering applications. One of these applications is electronic Airport Layout Plan (eALP) whose primary aim is to move from paper to digital form of ALP. The first phase of development of eALP was completed recently and it was tested for a few pilot program airports across different regions. We conducted gap analysis and noticed that a lot of development work is needed to fine tune at least six mandatory sheets of eALP. It is important to note that significant amount of programming is needed to move from out-of-box ArcGIS to a much customized ArcGIS which will be discussed. The ArcGIS viewer capability to display essential features like runway or taxiway or the perpendicular distance between them will be discussed. An enterprise level workflow which incorporates coordination process among different lines of business will be highlighted.Keywords: geospatial data, geology, geographic information systems, aviation
Procedia PDF Downloads 41623952 Belt Conveyor Dynamics in Transient Operation for Speed Control
Authors: D. He, Y. Pang, G. Lodewijks
Abstract:
Belt conveyors play an important role in continuous dry bulk material transport, especially at the mining industry. Speed control is expected to reduce the energy consumption of belt conveyors. Transient operation is the operation of increasing or decreasing conveyor speed for speed control. According to literature review, current research rarely takes the conveyor dynamics in transient operation into account. However, in belt conveyor speed control, the conveyor dynamic behaviors are significantly important since the poor dynamics might result in risks. In this paper, the potential risks in transient operation will be analyzed. An existing finite element model will be applied to build a conveyor model, and simulations will be carried out to analyze the conveyor dynamics. In order to realize the soft speed regulation, Harrison’s sinusoid acceleration profile will be applied, and Lodewijks estimator will be built to approximate the required acceleration time. A long inclined belt conveyor will be studied with two major simulations. The conveyor dynamics will be given.Keywords: belt conveyor , speed control, transient operation, dynamics
Procedia PDF Downloads 33123951 Anisotropic Total Fractional Order Variation Model in Seismic Data Denoising
Authors: Jianwei Ma, Diriba Gemechu
Abstract:
In seismic data processing, attenuation of random noise is the basic step to improve quality of data for further application of seismic data in exploration and development in different gas and oil industries. The signal-to-noise ratio of the data also highly determines quality of seismic data. This factor affects the reliability as well as the accuracy of seismic signal during interpretation for different purposes in different companies. To use seismic data for further application and interpretation, we need to improve the signal-to-noise ration while attenuating random noise effectively. To improve the signal-to-noise ration and attenuating seismic random noise by preserving important features and information about seismic signals, we introduce the concept of anisotropic total fractional order denoising algorithm. The anisotropic total fractional order variation model defined in fractional order bounded variation is proposed as a regularization in seismic denoising. The split Bregman algorithm is employed to solve the minimization problem of the anisotropic total fractional order variation model and the corresponding denoising algorithm for the proposed method is derived. We test the effectiveness of theproposed method for synthetic and real seismic data sets and the denoised result is compared with F-X deconvolution and non-local means denoising algorithm.Keywords: anisotropic total fractional order variation, fractional order bounded variation, seismic random noise attenuation, split Bregman algorithm
Procedia PDF Downloads 20723950 Circular Bio-economy of Copper and Gold from Electronic Wastes
Authors: Sadia Ilyas, Hyunjung Kim, Rajiv R. Srivastava
Abstract:
Current work has attempted to establish the linkages between circular bio-economy and recycling of copper and gold from urban mine by applying microbial activities instead of the smelter and chemical technologies. Thereafter, based on the potential of microbial approaches and research hypothesis, the structural model has been tested for a significance level of 99%, which is supported by the corresponding standardization co-efficient values. A prediction model applied to determine the recycling impact on circular bio-economy indicates to re-circulate 51,833 tons of copper and 58 tons of gold by 2030 for the production of virgin metals/raw-materials, while recycling rate of the accumulated e-waste remains to be 20%. This restoration volume of copper and gold through the microbial activities corresponds to mitigate 174 million kg CO₂ emissions and 24 million m³ water consumption if compared with the primary production activities. The study potentially opens a new window for environmentally-friendly biotechnological recycling of e-waste urban mine under the umbrella concept of circular bio-economy.Keywords: urban mining, biobleaching, circular bio-economy, environmental impact
Procedia PDF Downloads 15723949 NSBS: Design of a Network Storage Backup System
Authors: Xinyan Zhang, Zhipeng Tan, Shan Fan
Abstract:
The first layer of defense against data loss is the backup data. This paper implements an agent-based network backup system used the backup, server-storage and server-backup agent these tripartite construction, and we realize the snapshot and hierarchical index in the NSBS. It realizes the control command and data flow separation, balances the system load, thereby improving the efficiency of the system backup and recovery. The test results show the agent-based network backup system can effectively improve the task-based concurrency, reasonably allocate network bandwidth, the system backup performance loss costs smaller and improves data recovery efficiency by 20%.Keywords: agent, network backup system, three architecture model, NSBS
Procedia PDF Downloads 45923948 A t-SNE and UMAP Based Neural Network Image Classification Algorithm
Authors: Shelby Simpson, William Stanley, Namir Naba, Xiaodi Wang
Abstract:
Both t-SNE and UMAP are brand new state of art tools to predominantly preserve the local structure that is to group neighboring data points together, which indeed provides a very informative visualization of heterogeneity in our data. In this research, we develop a t-SNE and UMAP base neural network image classification algorithm to embed the original dataset to a corresponding low dimensional dataset as a preprocessing step, then use this embedded database as input to our specially designed neural network classifier for image classification. We use the fashion MNIST data set, which is a labeled data set of images of clothing objects in our experiments. t-SNE and UMAP are used for dimensionality reduction of the data set and thus produce low dimensional embeddings. Furthermore, we use the embeddings from t-SNE and UMAP to feed into two neural networks. The accuracy of the models from the two neural networks is then compared to a dense neural network that does not use embedding as an input to show which model can classify the images of clothing objects more accurately.Keywords: t-SNE, UMAP, fashion MNIST, neural networks
Procedia PDF Downloads 19823947 Improving the Ability of Constructed Wetlands to Treat Acid Mine Drainage
Authors: Chigbo Emmanuel Ikechukwu
Abstract:
Constructed wetlands are seen as a potential means of ameliorating the poor quality water that derives from coal and gold mining operations. However, the processes whereby a wetland environment is able to improve water quality are not well understood and techniques for optimising their performance poorly developed. A parameter that may be manipulated in order to improve the treatment capacity of a wetland is the substrate in which the aquatic plants are rooted. This substrate can provide an environment wherein sulphate reducing bacteria, which contribute to the removal of contaminants from the water, are able to flourish. The bacteria require an energy source which is largely provided by carbon in the substrate. This paper discusses the form in which carbon is most suitable for the bacteria and describes the results of a series of experiments in which different materials were used as substrate. Synthetic acid mine drainage was passed through an anaerobic bioreactor that contained either compost or cow manure. The effluent water quality was monitored with respect to time and the effect of the substrate composition discussed.Keywords: constructed wetland, bacteria, carbon, acid mine drainage, sulphate
Procedia PDF Downloads 44123946 An Online Adaptive Thresholding Method to Classify Google Trends Data Anomalies for Investor Sentiment Analysis
Authors: Duygu Dere, Mert Ergeneci, Kaan Gokcesu
Abstract:
Google Trends data has gained increasing popularity in the applications of behavioral finance, decision science and risk management. Because of Google’s wide range of use, the Trends statistics provide significant information about the investor sentiment and intention, which can be used as decisive factors for corporate and risk management fields. However, an anomaly, a significant increase or decrease, in a certain query cannot be detected by the state of the art applications of computation due to the random baseline noise of the Trends data, which is modelled as an Additive white Gaussian noise (AWGN). Since through time, the baseline noise power shows a gradual change an adaptive thresholding method is required to track and learn the baseline noise for a correct classification. To this end, we introduce an online method to classify meaningful deviations in Google Trends data. Through extensive experiments, we demonstrate that our method can successfully classify various anomalies for plenty of different data.Keywords: adaptive data processing, behavioral finance , convex optimization, online learning, soft minimum thresholding
Procedia PDF Downloads 16723945 Response of Subfossile Diatoms, Cladocera, and Chironomidae in Sediments of Small Ponds to Changes in Wastewater Discharges from a Zn–Pb Mine
Authors: Ewa Szarek-Gwiazda, Agata Z. Wojtal, Agnieszka Pociecha, Andrzej Kownacki, Dariusz Ciszewski
Abstract:
Mining of metal ores is one of the largest sources of heavy metals, which deteriorate aquatic systems. The response of organisms to environmental changes can be well recorded in sediments of the affected water bodies and may be reconstructed based on analyses of organisms' remains. The present study aimed at the response of diatoms (Bacillariophyta), Cladocera, and Chironomidae communities to the impact of Zn-Pb mine water discharge recorded in sediment cores of small subsidence ponds on the Chechło River floodplain (Silesia–Krakow Region, southern Poland). We hypothesize various responses of the above groups to high metal concentrations (Cd, Pb, Zn, and Cu). The investigated ponds were formed either during the peak of the ore exploitation (DOWN) or after mining cessation (UP). Currently, the concentrations of dissolved metals (in µg g⁻¹) in water reached up to 0.53 for Cd, 7.3 for Pb, and up to 47.1 for Zn. All the sediment cores from subsidence ponds were heavily polluted with Cd 6.7–612 μg g⁻¹, Pb 0.1–10.2 mg g⁻¹, and Zn 0.5–23.1 mg g⁻¹. Core sediments varied also in respect to pH 5.8-7.1 and concentrations of organic matter (5.7-39.8%). The impact of high metal concentrations was expressed by the occurrence of metal-tolerant taxa like diatoms – Nitzschia amphibia, Sellaphora nigri, and Surirella brebisonii var. kuetzingii; Cladocera – Chydorus sphaericus (dominated in cores from all ponds), and Chironomidae – Chironomus and Cricotopus especially in the DOWN ponds. Statistical analysis exhibited a negative impact of metals on some taxa of diatoms and Cladocera but only on Polypedilum sp. from Chironomidae. The abundance of such diatoms like Gomphonema utae, Staurosirella pinnata, Eunotia bilunaris, and Cladocera like Alona, Chydorus, Graptoleberis, and Pleuroxus decreased with increasing Pb concentration. However, the occurrence or dominance of more sensitive species of diatoms and Cladocera indicates their adaptation to higher metal loads, which was facilitated by neutral pH and slightly alkaline waters. Diatom assemblages were generally resistant to Zn, Pb, Cu, and Cd pollution, as indicated by their large similarity to populations from non-contaminated waters. Comparison with reference objects clearly indicates the dominance of Achnanthidium minutissimum, Staurosira venter, and Fragilaria gracilis in very diverse assemblages of unpolluted waters. The distribution of the Cladocera and Chironomidae taxa depended on the habitat type. The DOWN ponds with stagnant water and overgrown with macrophytes were more suitable for cladocerans (14 taxa, higher diversity) than the UP ponds with river water flowing through their centre and with a small share of macrophytes (8 taxa). The Chironominae, mainly Chironomus and Microspectra, were abundant in cores from the UP ponds with muddy bottoms. Inversely, the density of Orthocladiinae, especially genus Cricotopus, was related to the organic matter content and dominated in cores from the DOWN ponds. The presence of diatoms like Nitzschia amphibia, Sellaphora nigri, and Surirella brebisonii var. kuetzingii, cladocerans: Bosmina longirostris, Chydorus sphaericus, Alona affinis, and A. rectangularis as well as Chironomidae Chironomus sp. (UP ponds) and Psecrotanypus varius (DOWN ponds) indicate the influence of the water trophy on their distribution.Keywords: Chironomidae, Cladocera, diatoms, metals, Zn-Pb mine, sediment cores, subsidence ponds
Procedia PDF Downloads 7723944 Energy Efficient Assessment of Energy Internet Based on Data-Driven Fuzzy Integrated Cloud Evaluation Algorithm
Authors: Chuanbo Xu, Xinying Li, Gejirifu De, Yunna Wu
Abstract:
Energy Internet (EI) is a new form that deeply integrates the Internet and the entire energy process from production to consumption. The assessment of energy efficient performance is of vital importance for the long-term sustainable development of EI project. Although the newly proposed fuzzy integrated cloud evaluation algorithm considers the randomness of uncertainty, it relies too much on the experience and knowledge of experts. Fortunately, the enrichment of EI data has enabled the utilization of data-driven methods. Therefore, the main purpose of this work is to assess the energy efficient of park-level EI by using a combination of a data-driven method with the fuzzy integrated cloud evaluation algorithm. Firstly, the indicators for the energy efficient are identified through literature review. Secondly, the artificial neural network (ANN)-based data-driven method is employed to cluster the values of indicators. Thirdly, the energy efficient of EI project is calculated through the fuzzy integrated cloud evaluation algorithm. Finally, the applicability of the proposed method is demonstrated by a case study.Keywords: energy efficient, energy internet, data-driven, fuzzy integrated evaluation, cloud model
Procedia PDF Downloads 20223943 Surviral: An Agent-Based Simulation Framework for Sars-Cov-2 Outcome Prediction
Authors: Sabrina Neururer, Marco Schweitzer, Werner Hackl, Bernhard Tilg, Patrick Raudaschl, Andreas Huber, Bernhard Pfeifer
Abstract:
History and the current outbreak of Covid-19 have shown the deadly potential of infectious diseases. However, infectious diseases also have a serious impact on areas other than health and healthcare, such as the economy or social life. These areas are strongly codependent. Therefore, disease control measures, such as social distancing, quarantines, curfews, or lockdowns, have to be adopted in a very considerate manner. Infectious disease modeling can support policy and decision-makers with adequate information regarding the dynamics of the pandemic and therefore assist in planning and enforcing appropriate measures that will prevent the healthcare system from collapsing. In this work, an agent-based simulation package named “survival” for simulating infectious diseases is presented. A special focus is put on SARS-Cov-2. The presented simulation package was used in Austria to model the SARS-Cov-2 outbreak from the beginning of 2020. Agent-based modeling is a relatively recent modeling approach. Since our world is getting more and more complex, the complexity of the underlying systems is also increasing. The development of tools and frameworks and increasing computational power advance the application of agent-based models. For parametrizing the presented model, different data sources, such as known infections, wastewater virus load, blood donor antibodies, circulating virus variants and the used capacity for hospitalization, as well as the availability of medical materials like ventilators, were integrated with a database system and used. The simulation result of the model was used for predicting the dynamics and the possible outcomes and was used by the health authorities to decide on the measures to be taken in order to control the pandemic situation. The survival package was implemented in the programming language Java and the analytics were performed with R Studio. During the first run in March 2020, the simulation showed that without measures other than individual personal behavior and appropriate medication, the death toll would have been about 27 million people worldwide within the first year. The model predicted the hospitalization rates (standard and intensive care) for Tyrol and South Tyrol with an accuracy of about 1.5% average error. They were calculated to provide 10-days forecasts. The state government and the hospitals were provided with the 10-days models to support their decision-making. This ensured that standard care was maintained for as long as possible without restrictions. Furthermore, various measures were estimated and thereafter enforced. Among other things, communities were quarantined based on the calculations while, in accordance with the calculations, the curfews for the entire population were reduced. With this framework, which is used in the national crisis team of the Austrian province of Tyrol, a very accurate model could be created on the federal state level as well as on the district and municipal level, which was able to provide decision-makers with a solid information basis. This framework can be transferred to various infectious diseases and thus can be used as a basis for future monitoring.Keywords: modelling, simulation, agent-based, SARS-Cov-2, COVID-19
Procedia PDF Downloads 17423942 Graph Based Traffic Analysis and Delay Prediction Using a Custom Built Dataset
Authors: Gabriele Borg, Alexei Debono, Charlie Abela
Abstract:
There on a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is also constantly experiencing ongoing population growth and an increase in mobilization demand. This research takes advantage of data which is continuously being sourced and converting it into useful information related to the traffic problem on the Maltese roads. The scope of this paper is to provide a methodology to create a custom dataset (MalTra - Malta Traffic) compiled from multiple participants from various locations across the island to identify the most common routes taken to expose the main areas of activity. This use of big data is seen being used in various technologies and is referred to as ITSs (Intelligent Transportation Systems), which has been concluded that there is significant potential in utilising such sources of data on a nationwide scale. Furthermore, a series of traffic prediction graph neural network models are conducted to compare MalTra to large-scale traffic datasets.Keywords: graph neural networks, traffic management, big data, mobile data patterns
Procedia PDF Downloads 13123941 Learning Compression Techniques on Smart Phone
Authors: Farouk Lawan Gambo, Hamada Mohammad
Abstract:
Data compression shrinks files into fewer bits than their original presentation. It has more advantage on the internet because the smaller a file, the faster it can be transferred but learning most of the concepts in data compression are abstract in nature, therefore, making them difficult to digest by some students (engineers in particular). This paper studies the learning preference of engineering students who tend to have strong, active, sensing, visual and sequential learning preferences, the paper also studies the three shift of technology-aided that learning has experienced, which mobile learning has been considered to be the feature of learning that will integrate other form of the education process. Lastly, we propose a design and implementation of mobile learning application using software engineering methodology that will enhance the traditional teaching and learning of data compression techniques.Keywords: data compression, learning preference, mobile learning, multimedia
Procedia PDF Downloads 44723940 Investigation of Delivery of Triple Play Services
Authors: Paramjit Mahey, Monica Sharma, Jasbinder Singh
Abstract:
Fiber based access networks can deliver performance that can support the increasing demands for high speed connections. One of the new technologies that have emerged in recent years is Passive Optical Networks. This paper is targeted to show the simultaneous delivery of triple play service (data, voice and video). The comparative investigation and suitability of various data rates is presented. It is demonstrated that as we increase the data rate, number of users to be accommodated decreases due to increase in bit error rate.Keywords: BER, PON, TDMPON, GPON, CWDM, OLT, ONT
Procedia PDF Downloads 54123939 Nazca: A Context-Based Matching Method for Searching Heterogeneous Structures
Authors: Karine B. de Oliveira, Carina F. Dorneles
Abstract:
The structure level matching is the problem of combining elements of a structure, which can be represented as entities, classes, XML elements, web forms, and so on. This is a challenge due to large number of distinct representations of semantically similar structures. This paper describes a structure-based matching method applied to search for different representations in data sources, considering the similarity between elements of two structures and the data source context. Using real data sources, we have conducted an experimental study comparing our approach with our baseline implementation and with another important schema matching approach. We demonstrate that our proposal reaches higher precision than the baseline.Keywords: context, data source, index, matching, search, similarity, structure
Procedia PDF Downloads 36423938 Spatially Random Sampling for Retail Food Risk Factors Study
Authors: Guilan Huang
Abstract:
In 2013 and 2014, the U.S. Food and Drug Administration (FDA) collected data from selected fast food restaurants and full service restaurants for tracking changes in the occurrence of foodborne illness risk factors. This paper discussed how we customized spatial random sampling method by considering financial position and availability of FDA resources, and how we enriched restaurants data with location. Location information of restaurants provides opportunity for quantitatively determining random sampling within non-government units (e.g.: 240 kilometers around each data-collector). Spatial analysis also could optimize data-collectors’ work plans and resource allocation. Spatial analytic and processing platform helped us handling the spatial random sampling challenges. Our method fits in FDA’s ability to pinpoint features of foodservice establishments, and reduced both time and expense on data collection.Keywords: geospatial technology, restaurant, retail food risk factor study, spatially random sampling
Procedia PDF Downloads 35023937 Automatic MC/DC Test Data Generation from Software Module Description
Authors: Sekou Kangoye, Alexis Todoskoff, Mihaela Barreau
Abstract:
Modified Condition/Decision Coverage (MC/DC) is a structural coverage criterion that is highly recommended or required for safety-critical software coverage. Therefore, many testing standards include this criterion and require it to be satisfied at a particular level of testing (e.g. validation and unit levels). However, an important amount of time is needed to meet those requirements. In this paper we propose to automate MC/DC test data generation. Thus, we present an approach to automatically generate MC/DC test data, from software module description written over a dedicated language. We introduce a new merging approach that provides high MC/DC coverage for the description, with only a little number of test cases.Keywords: domain-specific language, MC/DC, test data generation, safety-critical software coverage
Procedia PDF Downloads 44123936 Blockchain-Based Approach on Security Enhancement of Distributed System in Healthcare Sector
Authors: Loong Qing Zhe, Foo Jing Heng
Abstract:
A variety of data files are now available on the internet due to the advancement of technology across the globe today. As more and more data are being uploaded on the internet, people are becoming more concerned that their private data, particularly medical health records, are being compromised and sold to others for money. Hence, the accessibility and confidentiality of patients' medical records have to be protected through electronic means. Blockchain technology is introduced to offer patients security against adversaries or unauthorised parties. In the blockchain network, only authorised personnel or organisations that have been validated as nodes may share information and data. For any change within the network, including adding a new block or modifying existing information about the block, a majority of two-thirds of the vote is required to confirm its legitimacy. Additionally, a consortium permission blockchain will connect all the entities within the same community. Consequently, all medical data in the network can be safely shared with all authorised entities. Also, synchronization can be performed within the cloud since the data is real-time. This paper discusses an efficient method for storing and sharing electronic health records (EHRs). It also examines the framework of roles within the blockchain and proposes a new approach to maintain EHRs with keyword indexes to search for patients' medical records while ensuring data privacy.Keywords: healthcare sectors, distributed system, blockchain, electronic health records (EHR)
Procedia PDF Downloads 19123935 Demographic Factors Influencing Employees’ Salary Expectations and Labor Turnover
Authors: M. Osipova
Abstract:
Thanks to informational technologies development every sphere of economics is becoming more and more data-centralized as people are generating huge datasets containing information on any aspect of their life. Applying research of such data to human resources management allows getting scarce statistics on labor market state including salary expectations and potential employees’ typical career behavior, and this information can become a reliable basis for management decisions. The following article presents results of career behavior research based on freely accessible resume data. Information used for study is much wider than one usually uses in human resources surveys. That is why there is enough data for statistically significant results even for subgroups analysis.Keywords: human resources management, salary expectations, statistics, turnover
Procedia PDF Downloads 34923934 Exploring Electroactive Polymers for Dynamic Data Physicalization
Authors: Joanna Dauner, Jan Friedrich, Linda Elsner, Kora Kimpel
Abstract:
Active materials such as Electroactive Polymers (EAPs) are promising for the development of novel shape-changing interfaces. This paper explores the potential of EAPs in a multilayer unimorph structure from a design perspective to investigate the visual qualities of the material for dynamic data visualization and data physicalization. We discuss various concepts of how the material can be used for this purpose. Multilayer unimorph EAPs are of particular interest to designers because they can be easily prototyped using everyday materials and tools. By changing the structure and geometry of the EAPs, their movement and behavior can be modified. We present the results of our preliminary user testing, where we evaluated different movement patterns. As a result, we introduce a prototype display built with EAPs for dynamic data physicalization. Finally, we discuss the potentials and drawbacks and identify further open research questions for the design discipline.Keywords: electroactive polymer, shape-changing interfaces, smart material interfaces, data physicalization
Procedia PDF Downloads 9923933 Research and Implementation of Cross-domain Data Sharing System in Net-centric Environment
Authors: Xiaoqing Wang, Jianjian Zong, Li Li, Yanxing Zheng, Jinrong Tong, Mao Zhan
Abstract:
With the rapid development of network and communication technology, a great deal of data has been generated in different domains of a network. These data show a trend of increasing scale and more complex structure. Therefore, an effective and flexible cross-domain data-sharing system is needed. The Cross-domain Data Sharing System(CDSS) in a net-centric environment is composed of three sub-systems. The data distribution sub-system provides data exchange service through publish-subscribe technology that supports asynchronism and multi-to-multi communication, which adapts to the needs of the dynamic and large-scale distributed computing environment. The access control sub-system adopts Attribute-Based Access Control(ABAC) technology to uniformly model various data attributes such as subject, object, permission and environment, which effectively monitors the activities of users accessing resources and ensures that legitimate users get effective access control rights within a legal time. The cross-domain access security negotiation subsystem automatically determines the access rights between different security domains in the process of interactive disclosure of digital certificates and access control policies through trust policy management and negotiation algorithms, which provides an effective means for cross-domain trust relationship establishment and access control in a distributed environment. The CDSS’s asynchronous,multi-to-multi and loosely-coupled communication features can adapt well to data exchange and sharing in dynamic, distributed and large-scale network environments. Next, we will give CDSS new features to support the mobile computing environment.Keywords: data sharing, cross-domain, data exchange, publish-subscribe
Procedia PDF Downloads 12423932 Routing Protocol in Ship Dynamic Positioning Based on WSN Clustering Data Fusion System
Authors: Zhou Mo, Dennis Chow
Abstract:
In the dynamic positioning system (DPS) for vessels, the reliable information transmission between each note basically relies on the wireless protocols. From the perspective of cluster-based routing protocols for wireless sensor networks, the data fusion technology based on the sleep scheduling mechanism and remaining energy in network layer is proposed, which applies the sleep scheduling mechanism to the routing protocols, considering the remaining energy of node and location information when selecting cluster-head. The problem of uneven distribution of nodes in each cluster is solved by the Equilibrium. At the same time, Classified Forwarding Mechanism as well as Redelivery Policy strategy is adopted to avoid congestion in the transmission of huge amount of data, reduce the delay in data delivery and enhance the real-time response. In this paper, a simulation test is conducted to improve the routing protocols, which turn out to reduce the energy consumption of nodes and increase the efficiency of data delivery.Keywords: DPS for vessel, wireless sensor network, data fusion, routing protocols
Procedia PDF Downloads 52423931 Advanced Data Visualization Techniques for Effective Decision-making in Oil and Gas Exploration and Production
Authors: Deepak Singh, Rail Kuliev
Abstract:
This research article explores the significance of advanced data visualization techniques in enhancing decision-making processes within the oil and gas exploration and production domain. With the oil and gas industry facing numerous challenges, effective interpretation and analysis of vast and diverse datasets are crucial for optimizing exploration strategies, production operations, and risk assessment. The article highlights the importance of data visualization in managing big data, aiding the decision-making process, and facilitating communication with stakeholders. Various advanced data visualization techniques, including 3D visualization, augmented reality (AR), virtual reality (VR), interactive dashboards, and geospatial visualization, are discussed in detail, showcasing their applications and benefits in the oil and gas sector. The article presents case studies demonstrating the successful use of these techniques in optimizing well placement, real-time operations monitoring, and virtual reality training. Additionally, the article addresses the challenges of data integration and scalability, emphasizing the need for future developments in AI-driven visualization. In conclusion, this research emphasizes the immense potential of advanced data visualization in revolutionizing decision-making processes, fostering data-driven strategies, and promoting sustainable growth and improved operational efficiency within the oil and gas exploration and production industry.Keywords: augmented reality (AR), virtual reality (VR), interactive dashboards, real-time operations monitoring
Procedia PDF Downloads 8623930 The Data Quality Model for the IoT based Real-time Water Quality Monitoring Sensors
Authors: Rabbia Idrees, Ananda Maiti, Saurabh Garg, Muhammad Bilal Amin
Abstract:
IoT devices are the basic building blocks of IoT network that generate enormous volume of real-time and high-speed data to help organizations and companies to take intelligent decisions. To integrate this enormous data from multisource and transfer it to the appropriate client is the fundamental of IoT development. The handling of this huge quantity of devices along with the huge volume of data is very challenging. The IoT devices are battery-powered and resource-constrained and to provide energy efficient communication, these IoT devices go sleep or online/wakeup periodically and a-periodically depending on the traffic loads to reduce energy consumption. Sometime these devices get disconnected due to device battery depletion. If the node is not available in the network, then the IoT network provides incomplete, missing, and inaccurate data. Moreover, many IoT applications, like vehicle tracking and patient tracking require the IoT devices to be mobile. Due to this mobility, If the distance of the device from the sink node become greater than required, the connection is lost. Due to this disconnection other devices join the network for replacing the broken-down and left devices. This make IoT devices dynamic in nature which brings uncertainty and unreliability in the IoT network and hence produce bad quality of data. Due to this dynamic nature of IoT devices we do not know the actual reason of abnormal data. If data are of poor-quality decisions are likely to be unsound. It is highly important to process data and estimate data quality before bringing it to use in IoT applications. In the past many researchers tried to estimate data quality and provided several Machine Learning (ML), stochastic and statistical methods to perform analysis on stored data in the data processing layer, without focusing the challenges and issues arises from the dynamic nature of IoT devices and how it is impacting data quality. A comprehensive review on determining the impact of dynamic nature of IoT devices on data quality is done in this research and presented a data quality model that can deal with this challenge and produce good quality of data. This research presents the data quality model for the sensors monitoring water quality. DBSCAN clustering and weather sensors are used in this research to make data quality model for the sensors monitoring water quality. An extensive study has been done in this research on finding the relationship between the data of weather sensors and sensors monitoring water quality of the lakes and beaches. The detailed theoretical analysis has been presented in this research mentioning correlation between independent data streams of the two sets of sensors. With the help of the analysis and DBSCAN, a data quality model is prepared. This model encompasses five dimensions of data quality: outliers’ detection and removal, completeness, patterns of missing values and checks the accuracy of the data with the help of cluster’s position. At the end, the statistical analysis has been done on the clusters formed as the result of DBSCAN, and consistency is evaluated through Coefficient of Variation (CoV).Keywords: clustering, data quality, DBSCAN, and Internet of things (IoT)
Procedia PDF Downloads 139