Search results for: data harvesting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25495

24235 Meta Mask Correction for Nuclei Segmentation in Histopathological Image

Authors: Jiangbo Shi, Zeyu Gao, Chen Li

Abstract:

Nuclei segmentation is a fundamental task in digital pathology analysis and can be automated by deep learning-based methods. However, the development of such an automated method requires a large amount of data with precisely annotated masks, which are hard to obtain. Training with weakly labeled data is a popular solution for reducing the annotation workload. In this paper, we propose a novel meta-learning-based nuclei segmentation method which follows the label correction paradigm to leverage data with noisy masks. Specifically, we design a fully convolutional meta-model that can correct noisy masks by using a small amount of clean meta-data. The corrected masks are then used to supervise the training of the segmentation model. Meanwhile, a bi-level optimization method is adopted to alternately update the parameters of the main segmentation model and the meta-model. Extensive experimental results on two nuclei segmentation datasets show that our method achieves state-of-the-art results. In particular, in some noise scenarios, it even exceeds the performance of training on fully supervised data.
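
The alternating, bi-level update described above can be sketched as follows. This is a minimal first-order approximation in PyTorch, not the authors' code: the tiny architectures, tensor shapes, and the synthetic-noise step used to train the corrector on the clean meta-set are all assumptions.

```python
# Minimal first-order sketch of the alternating update (the paper's true
# bi-level scheme uses second-order gradients through the meta-model).
import torch
import torch.nn as nn

seg_model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(8, 1, 1))            # main segmentation model
meta_model = nn.Sequential(nn.Conv2d(4, 8, 3, padding=1), nn.ReLU(),
                           nn.Conv2d(8, 1, 1))           # fully convolutional corrector
opt_seg = torch.optim.Adam(seg_model.parameters(), lr=1e-3)
opt_meta = torch.optim.Adam(meta_model.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def step(img, noisy_mask, meta_img, meta_mask):
    # 1) meta-model corrects the noisy mask, conditioned on image + mask
    corrected = torch.sigmoid(meta_model(torch.cat([img, noisy_mask], dim=1)))
    # 2) train the segmentation model against the corrected mask
    opt_seg.zero_grad()
    bce(seg_model(img), corrected.detach()).backward()
    opt_seg.step()
    # 3) update the corrector on the small clean meta-set
    #    (synthetic noise stands in for real annotation noise -- assumption)
    noisy_meta = (meta_mask + 0.3 * torch.randn_like(meta_mask)).clamp(0, 1)
    opt_meta.zero_grad()
    bce(meta_model(torch.cat([meta_img, noisy_meta], dim=1)), meta_mask).backward()
    opt_meta.step()

img = torch.rand(2, 3, 64, 64)
noisy = torch.randint(0, 2, (2, 1, 64, 64)).float()
step(img, noisy, img, noisy)  # smoke test with dummy tensors
```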

Keywords: deep learning, histopathological image, meta-learning, nuclei segmentation, weak annotations

Procedia PDF Downloads 140
24234 Feature Selection Approach for the Classification of Hydraulic Leakages in Hydraulic Final Inspection using Machine Learning

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Manufacturing companies are facing global competition and enormous cost pressure. The use of machine learning applications can help reduce production costs and create added value. Predictive quality enables the securing of product quality through data-supported predictions, using machine learning models as a basis for decisions on test results. Furthermore, machine learning methods are able to process large amounts of data, deal with unfavourable row-column ratios, detect dependencies between the covariates and the given target, and assess the multidimensional influence of all input variables on the target. Real production data are often subject to highly fluctuating boundary conditions and unbalanced data sets. Changes in production data manifest themselves in trends, systematic shifts, and seasonal effects. Thus, machine learning applications require intensive pre-processing and feature selection. Data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets. Within the real data set of Bosch hydraulic valves used here, time periods with comparable production conditions in the manufacture of hydraulic valves can be identified by applying a concept drift method. Furthermore, a classification model is developed to evaluate the feature importance in different subsets within the identified time periods. By selecting comparable and stable features, the number of features used can be significantly reduced without a strong decrease in predictive power. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predicting the quality characteristics of workpieces. In this research, the AdaBoost classifier is used to predict the leakage of hydraulic valves based on geometric gauge blocks from machining, mating data from the assembly, and hydraulic measurement data from end-of-line testing. In addition, the most suitable methods are selected and accurate quality predictions are achieved.
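
As an illustration of the boosting-plus-feature-selection step, a minimal sketch with scikit-learn follows; the synthetic features stand in for the gauge, assembly and end-of-line measurements, and the importance threshold is an assumption:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))   # placeholder gauge / assembly / EOL-test features
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)  # leakage flag

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# keep only features whose boosted importance exceeds the mean importance
selector = SelectFromModel(clf, prefit=True, threshold="mean")
clf_sel = AdaBoostClassifier(n_estimators=200, random_state=0).fit(
    selector.transform(X_tr), y_tr)
print("full:", clf.score(X_te, y_te),
      "reduced:", clf_sel.score(selector.transform(X_te), y_te))
```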

Keywords: classification, machine learning, predictive quality, feature selection

Procedia PDF Downloads 162
24233 Organic Fertilizers Mitigate Microplastics Toxicity in Agricultural Soil

Authors: Ghulam Abbas Shah, Maqsood Sadiq, Ahsan Yasin

Abstract:

Massive global plastic production, combined with poor degradation and recycling, leads to significant environmental pollution from microplastics, whose effects on plants in the soil remain understudied. Moreover, effective mitigation strategies and their impact on ammonia (NH₃) emissions under varying fertilizer management practices remain poorly understood. Therefore, the objectives of the study were (i) to determine the impact of organic fertilizers on the toxicity of microplastics in sorghum and on the physicochemical characteristics of microplastics-contaminated soil and (ii) to assess the impacts of these fertilizers on NH₃ emissions from this soil. A field experiment was conducted using sorghum as a test crop. Treatments were: (i) Control (C), (ii) Microplastics (MP), (iii) Inorganic fertilizer (IF), (iv) MPIF, (v) Farmyard manure (FM), (vi) MPFM, (vii) Biochar (BC), and (viii) MPBC, arranged in a randomized complete block design (RCBD) with three replicates. Microplastics of polyvinyl chloride (PVC) were applied at a rate of 1.5 t ha⁻¹, and all fertilizers were applied at the recommended dose of 90 kg N ha⁻¹. Soil sampling was done before sowing and after harvesting the sorghum, with samples analyzed for chemical properties and microbial biomass. Crop growth and yield attributes were measured. In a parallel pot experiment, NH₃ emissions were measured using passive flux samplers over 72 hours following the application of treatments similar to those used in the field experiment. Application of MPFM, MPBC and MPIF reduced soil mineral nitrogen by 8, 20 and 38% compared to their sole treatments, respectively. Microbial biomass carbon (MBC) was reduced by 19, 25 and 59% in MPIF, MPBC and MPFM compared to their sole application, respectively. Similarly, the respective reductions in microbial biomass nitrogen (MBN) were 10, 27 and 66%. The toxicity of microplastics was mitigated by MPFM and MPBC, each with only a 5% reduction in grain yield of sorghum relative to their sole treatments. The differences in nitrogen uptake between BC vs. MPBC, FM vs. MPFM, and IF vs. MPIF were 8, 10, and 12 kg N ha⁻¹, respectively, indicating that organic fertilizers mitigate microplastic toxicity in the soil. NH₃ emissions were reduced by 5, 11 and 20% after application of MPFM, MPBC and MPIF compared to their sole treatments, respectively. The study concludes that organic fertilizers such as FM and BC can effectively mitigate the toxicity of microplastics in soil, leading to improved crop growth and yield.

Keywords: microplastics, soil characteristics, crop N uptake, biochar, NH₃ emissions

Procedia PDF Downloads 39
24232 Secure Data Sharing of Electronic Health Records With Blockchain

Authors: Kenneth Harper

Abstract:

The secure sharing of Electronic Health Records (EHRs) is a critical challenge in modern healthcare, demanding solutions to enhance interoperability, privacy, and data integrity. Traditional standards like Health Information Exchange (HIE) and HL7 have made significant strides in facilitating data exchange between healthcare entities. However, these approaches rely on centralized architectures that are often vulnerable to data breaches, lack sufficient privacy measures, and have scalability issues. This paper proposes a framework for secure, decentralized sharing of EHRs using blockchain technology, cryptographic tokens, and Non-Fungible Tokens (NFTs). The blockchain's immutable ledger, decentralized control, and inherent security mechanisms are leveraged to improve transparency, accountability, and auditability in healthcare data exchanges. Furthermore, we introduce the concept of tokenizing patient data through NFTs, creating unique digital identifiers for each record, which allows for granular data access controls and proof of data ownership. These NFTs can also be employed to grant access to authorized parties, establishing a secure and transparent data sharing model that empowers both healthcare providers and patients. The proposed approach addresses common privacy concerns by employing privacy-preserving techniques such as zero-knowledge proofs (ZKPs) and homomorphic encryption to ensure that sensitive patient information can be shared without exposing the actual content of the data. This ensures compliance with regulations like HIPAA and GDPR. Additionally, the integration of Fast Healthcare Interoperability Resources (FHIR) with blockchain technology allows for enhanced interoperability, enabling healthcare organizations to exchange data seamlessly and securely across various systems while maintaining data governance and regulatory compliance. Through real-world case studies and simulations, this paper demonstrates how blockchain-based EHR sharing can reduce operational costs, improve patient outcomes, and enhance the security and privacy of healthcare data. This decentralized framework holds great potential for revolutionizing healthcare information exchange, providing a transparent, scalable, and secure method for managing patient data in a highly regulated environment.
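
The tokenization idea can be illustrated with a simplified, self-contained sketch: each record receives a unique identifier derived from its content hash, and access grants are appended to a ledger. A real deployment would use an actual blockchain platform and proper key management; every name below is illustrative, not the paper's implementation.

```python
import hashlib, json, time

ledger = []  # stand-in for the immutable, append-only ledger

def mint_record_token(record: dict, owner: str) -> str:
    # unique digital identifier derived from the record's content hash
    token_id = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    ledger.append({"event": "mint", "token": token_id, "owner": owner, "ts": time.time()})
    return token_id

def grant_access(token_id: str, grantee: str, owner: str) -> None:
    ledger.append({"event": "grant", "token": token_id, "from": owner,
                   "to": grantee, "ts": time.time()})

def has_access(token_id: str, party: str) -> bool:
    return any(e["token"] == token_id and
               (e.get("owner") == party or e.get("to") == party) for e in ledger)

tid = mint_record_token({"patient": "p-001", "obs": "HbA1c 6.1%"}, owner="patient-key")
grant_access(tid, grantee="clinic-key", owner="patient-key")
print(has_access(tid, "clinic-key"))  # True
```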

Keywords: blockchain, electronic health records (EHRs), fast healthcare interoperability resources (FHIR), health information exchange (HIE), HL7, interoperability, non-fungible tokens (NFTs), privacy-preserving techniques, tokens, secure data sharing

Procedia PDF Downloads 21
24231 The Twin Terminal of Pedestrian Trajectory Based on City Intelligent Model (CIM) 4.0

Authors: Chen Xi, Liu Xuebing, Lao Xueru, Kuan Sinman, Jiang Yike, Wang Hanwei, Yang Xiaolang, Zhou Junjie, Xie Jinpeng

Abstract:

To further promote the development of smart cities, the microscopic "nerve endings" of the City Intelligent Model (CIM) need to be extended and made more sensitive. In this paper, we develop a pedestrian-trajectory twin terminal based on CIM and CNN technology. It uses 5G networks, architectural and geoinformatics technologies, and convolutional neural networks combined with deep learning models for human behavior recognition to provide empirical data such as pedestrian flow and human behavioral characteristics, and ultimately forms spatial performance evaluation criteria and a spatial performance warning system, making the empirical data accurate and intelligent for prediction and decision making.

Keywords: urban planning, urban governance, CIM, artificial intelligence, sustainable development

Procedia PDF Downloads 419
24230 Thin Film Thermoelectric Generator with Flexible Phase Change Material-Based Heatsink

Authors: Wu Peiqin

Abstract:

Flexible thermoelectric devices are light and flexible and can be in close contact with heat source surfaces of any shape to minimize heat loss and achieve efficient energy conversion. Among the wide range of application fields, energy harvesting via flexible thermoelectric generators can adapt to a variety of curved heat sources (such as the human body, circular tubes, and surfaces of different shapes) and can drive low-power electronic devices, making it one of the most promising technologies for self-powered systems. The heat flux along the cross-section of a flexible thin-film generator is limited by the thickness, so the temperature difference decreases during generation and the output power is low. At present, the heat flow direction of most thin-film thermoelectric generators is along the thin-film plane; however, this configuration is not suitable for attaching to the human body surface to generate electricity. In order to make the film generator more suitable for thermoelectric generation, it is necessary to apply a flexible heatsink on the air side of the film to maintain the temperature difference. In this paper, bismuth telluride thermoelectric paste was deposited on a polyimide flexible substrate by a screen printing method, and the flexible thermoelectric film was formed after drying. There are ten pairs of thermoelectric legs. The size of each thermoelectric leg is 20 x 2 x 0.1 mm, and adjacent thermoelectric legs are spaced 2 mm apart. A phase change material-based flexible heatsink was designed and fabricated. The flexible heatsink consists of n-octadecane, polystyrene, and expanded graphite: n-octadecane was used as the thermal storage material, polystyrene as the supporting material, and expanded graphite as the thermally conductive additive. The thickness of the flexible phase change material-based heatsink is 2 mm. A thermoelectric performance testing platform was built, and the output performance was tested. The results show that the system can generate an open-circuit output voltage of 3.89 mV at a temperature difference of 10 K, which is higher than that of the generator without a heatsink. Therefore, the flexible heatsink can increase the temperature difference between the two ends of the film and improve the output performance of the flexible film generator. This result promotes the application of film thermoelectric generators in collecting human body heat for power generation.
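
A quick back-of-the-envelope check of the reported figure, using the usual relation V_oc = N x S_pair x ΔT; the per-pair Seebeck value derived here is an implied effective number for the whole device, not a reported material property:

```python
# Solve V_oc = n_pairs * s_pair * dT for the effective Seebeck coefficient
# per thermocouple implied by the measurement reported in the abstract.
n_pairs, v_oc, d_t = 10, 3.89e-3, 10.0          # pairs, volts, kelvin
s_pair = v_oc / (n_pairs * d_t)
print(f"effective Seebeck per pair: {s_pair * 1e6:.1f} uV/K")  # ~38.9 uV/K
```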

Keywords: flexible thermoelectric generator, screen printing, PCM, flexible heatsink

Procedia PDF Downloads 101
24229 An Extended Inverse Pareto Distribution, with Applications

Authors: Abdel Hadi Ebraheim

Abstract:

This paper introduces a new extension of the inverse Pareto distribution in the framework of the Marshall-Olkin (1997) family of distributions. This model is capable of modeling various shapes of aging and failure data. The statistical properties of the new model are discussed. Several methods are used to estimate the parameters involved. Explicit expressions are derived for different types of moments that are of value in reliability analysis. Besides, the order statistics of samples from the new proposed model have been studied. Finally, the usefulness of the new model for modeling reliability data is illustrated using two real data sets, together with a simulation study.
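
For reference, the Marshall-Olkin construction applies a tilt parameter α to a baseline survival function F̄; combined with the common actuarial parameterization of the inverse Pareto CDF (an assumption about the paper's exact notation), the extended model takes the form:

```latex
% Marshall-Olkin (1997) family applied to a baseline survival function
% \bar{F}; the inverse Pareto CDF below is the standard actuarial
% parameterization and is assumed, not quoted from the paper.
\bar{G}(x;\alpha) \;=\; \frac{\alpha\,\bar{F}(x)}{1-(1-\alpha)\,\bar{F}(x)},
\qquad \alpha>0,
\quad\text{with}\quad
F(x) \;=\; \Bigl(\frac{x}{x+\theta}\Bigr)^{\beta},
\qquad x>0,\ \theta>0,\ \beta>0 .
```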

Keywords: Pareto distribution, Marshall-Olkin, reliability, hazard functions, moments, estimation

Procedia PDF Downloads 82
24228 Potential Determinants of Research Output: Comparing Economics and Business

Authors: Osiris Jorge Parcero, Néstor Gandelman, Flavia Roldán, Josef Montag

Abstract:

This paper uses cross-country unbalanced panel data of up to 146 countries over the period 1996 to 2015 to be the first study to identify potential determinants of a country’s relative research output in Economics versus Business. More generally, it is also one of the first studies comparing Economics and Business. The results show that better policy-related data availability, higher income inequality, and lower ethnic fractionalization relatively favor economics. The findings are robust to two alternative fixed effects specifications, three alternative definitions of economics and business, two alternative measures of research output (publications and citations), and the inclusion of meaningful control variables. To the best of our knowledge, our paper is also the first to demonstrate the importance of policy-related data as drivers of economic research. Our regressions show that the availability of this type of data is the single most important factor associated with the prevalence of economics over business as a research domain. Thus, our work has policy implications, as the availability of policy-related data is partially under policy control. Moreover, it has implications for students, professionals, universities, university departments, and research-funding agencies that face choices between profiles oriented toward economics and those oriented toward business. Finally, the conclusions show potential lines for further research.

Keywords: research output, publication performance, bibliometrics, economics, business, policy-related data

Procedia PDF Downloads 134
24227 Assessment of Routine Health Information System (RHIS) Quality Assurance Practices in Tarkwa Sub-Municipal Health Directorate, Ghana

Authors: Richard Okyere Boadu, Judith Obiri-Yeboah, Kwame Adu Okyere Boadu, Nathan Kumasenu Mensah, Grace Amoh-Agyei

Abstract:

Routine health information system (RHIS) quality assurance has become an important issue, not only because of its significance in promoting a high standard of patient care but also because of its impact on government budgets for the maintenance of health services. A routine health information system comprises healthcare data collection, compilation, storage, analysis, report generation, and dissemination on a routine basis in various healthcare settings. The data from RHIS give a representation of health status, health services, and health resources. The sources of RHIS data are normally individual health records, records of services delivered, and records of health resources. Using reliable information from routine health information systems is fundamental in the healthcare delivery system. Quality assurance practices are measures that are put in place to ensure the health data that are collected meet required quality standards. Routine health information system quality assurance practices ensure that data that are generated from the system are fit for use. This study considered quality assurance practices in the RHIS processes. Methods: A cross-sectional study was conducted in eight health facilities in Tarkwa Sub-Municipal Health Service in the western region of Ghana. The study involved routine quality assurance practices among the 90 health staff and management selected from facilities in Tarkwa Sub-Municipal who collected or used data routinely from 24th December 2019 to 20th January 2020. Results: Generally, Tarkwa Sub-Municipal health service appears to practice quality assurance during data collection, compilation, storage, analysis and dissemination. The results show some achievement in quality control performance in report dissemination (77.6%), data analysis (68.0%), data compilation (67.4%), report compilation (66.3%), data storage (66.3%) and collection (61.1%). Conclusions: Even though the Tarkwa Sub-Municipal Health Directorate engages in some control measures to ensure data quality, there is a need to strengthen the process to achieve the targeted percentage of performance (90.0%). There was a significant shortfall in quality assurance practices performance, especially during data collection, with respect to the expected performance.

Keywords: quality assurance practices, assessment of routine health information system quality, routine health information system, data quality

Procedia PDF Downloads 79
24226 Heart Failure Identification and Progression by Classifying Cardiac Patients

Authors: Muhammad Saqlain, Nazar Abbas Saqib, Muazzam A. Khan

Abstract:

Heart Failure (HF) has become a major health problem in our society. The prevalence of HF increases with patient age, and it is a major cause of the high mortality rate in adults. Successful identification and progression tracking of HF can help reduce the individual and social burden of this syndrome. In this study, we use a real data set of cardiac patients to propose a classification model for the identification and progression of HF. The data set was divided into three age groups, namely young, adult, and old, and each age group was further classified into four classes according to the patient's current physical condition. Contemporary data mining classification algorithms were applied to each individual class of every age group to identify HF. The Decision Tree (DT) gives the highest accuracy of 90% and outperforms all other algorithms. Our model accurately diagnoses different stages of HF for each age group and can be very useful for the early prediction of HF.
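
The per-age-group classification scheme can be sketched with scikit-learn as follows; the features, class labels and group sizes are placeholders rather than the study's data dictionary:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
for group in ("young", "adult", "old"):
    X = rng.normal(size=(300, 6))          # placeholder clinical attributes
    y = rng.integers(0, 4, size=300)       # four physical-condition classes
    acc = cross_val_score(DecisionTreeClassifier(max_depth=5, random_state=0),
                          X, y, cv=5).mean()
    print(f"{group}: mean CV accuracy = {acc:.2f}")  # DT reached 90% in the study
```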

Keywords: decision tree, heart failure, data mining, classification model

Procedia PDF Downloads 402
24225 Critically Analyzing the Application of Big Data for Smart Transportation: A Case Study of Mumbai

Authors: Tanuj Joshi

Abstract:

Smart transportation is fast emerging as a solution to modern cities' mobility issues, delayed emergency response rates and high congestion on streets. The present-day scenario with Google Maps, Waze, Yelp, etc. demonstrates how information and communications technologies control the intelligent transportation system. This intangible and invisible infrastructure is largely guided by big data analytics. On the other side, the exponential increase in the Indian urban population has intensified the demand for better services and infrastructure to satisfy the transportation needs of its citizens. No doubt, India's huge internet usage is seen as an important resource for achieving this. However, with a projected number of over 40 billion objects connected to the Internet by 2025, the need for systems to handle massive volumes of data (big data) also arises. This research paper attempts to identify ways of exploiting the big data variables that will aid commuters on Indian tracks. The study explores real-life inputs by conducting surveys and interviews to identify which gaps need to be targeted to better satisfy the customers. Several experts at the Mumbai Metropolitan Region Development Authority (MMRDA), Mumbai Metro and Brihanmumbai Electric Supply and Transport (BEST) were interviewed regarding the Information Technology (IT) systems currently in use. The interviews give relevant insights into and requirements for the workings of public transportation systems, whereas the survey investigates the macro situation.

Keywords: smart transportation, mobility issue, Mumbai transportation, big data, data analysis

Procedia PDF Downloads 178
24224 Scientific Linux Cluster for BIG-DATA Analysis (SLBD): A Case of Fayoum University

Authors: Hassan S. Hussein, Rania A. Abul Seoud, Amr M. Refaat

Abstract:

Scientific researchers face the analysis of very large data sets, which are increasing at a noticeable rate in today's and tomorrow's technologies. Hadoop and Spark are frameworks developed for this purpose; the Hadoop framework is suitable for many different hardware platforms. In this research, a scientific Linux cluster for big data analysis (SLBD) is presented. SLBD runs open source software with large computational capacity on a high-performance cluster infrastructure. SLBD is composed of one cluster containing identical, commodity-grade computers interconnected via a small LAN. SLBD consists of a fast switch and Gigabit-Ethernet cards connecting four nodes. Cloudera Manager is used to configure and manage an Apache Hadoop stack. Hadoop is a framework that allows storing and processing big data across the cluster by using the MapReduce algorithm. The MapReduce algorithm divides the task into smaller tasks, which are assigned to the network nodes; it then collects the results and forms the final result dataset. The SLBD clustering system allows fast and efficient processing of large amounts of data resulting from different applications. SLBD also provides high performance, high throughput, high availability, expandability and cluster scalability.
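
The MapReduce flow described above (split the task, map over chunks, shuffle by key, reduce to the final dataset) can be illustrated with a toy word count in pure Python:

```python
from collections import defaultdict
from functools import reduce

docs = ["big data on the cluster", "the cluster stores big data"]

def map_phase(chunk):                       # each chunk would run on a node
    return [(word, 1) for word in chunk.split()]

def shuffle(pairs):                         # group intermediate results by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):                   # collect and form the final dataset
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in groups.items()}

mapped = [pair for chunk in docs for pair in map_phase(chunk)]
print(reduce_phase(shuffle(mapped)))        # {'big': 2, 'data': 2, ...}
```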

Keywords: big data platforms, cloudera manager, Hadoop, MapReduce

Procedia PDF Downloads 358
24223 Investigating the Effects of Data Transformations on a Bi-Dimensional Chi-Square Test

Authors: Alexandru George Vaduva, Adriana Vlad, Bogdan Badea

Abstract:

In this research, we conduct a Monte Carlo analysis on a two-dimensional χ2 test, which is used to determine the minimum distance required for independent sampling in the context of chaotic signals. We investigate the impact of transforming initial data sets from any probability distribution to new signals with a uniform distribution using the Spearman rank correlation on the χ2 test. This transformation removes the randomness of the data pairs, and as a result, the observed distribution of χ2 test values differs from the expected distribution. We propose a solution to this problem and evaluate it using another chaotic signal.
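
A minimal Monte Carlo illustration of the setup follows: pairs of rank-transformed logistic-map samples at a candidate distance are binned and tested with the chi-square independence statistic. The map parameters, bin count and distance are illustrative choices, not the paper's settings.

```python
import numpy as np
from scipy.stats import chi2_contingency, rankdata

def logistic(n, x0=0.3, r=4.0):
    x = np.empty(n); x[0] = x0
    for i in range(1, n):
        x[i] = r * x[i - 1] * (1.0 - x[i - 1])
    return x

sig = logistic(20000)
u = rankdata(sig) / len(sig)                 # rank transform -> uniform marginals
d = 50                                       # candidate sampling distance
pairs = np.column_stack([u[:-d], u[d:]])
hist, _, _ = np.histogram2d(pairs[:, 0], pairs[:, 1], bins=10)
chi2, p, dof, _ = chi2_contingency(hist)
print(f"chi2={chi2:.1f}, dof={dof}, p={p:.3f}")  # a large p suggests independence
```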

Keywords: chaotic signals, logistic map, Pearson's test, chi-square test, bivariate distribution, statistical independence

Procedia PDF Downloads 97
24222 Real Time Data Communication with FlightGear Using Simulink Over a UDP Protocol

Authors: Adil Loya, Ali Haider, Arslan A. Ghaffor, Abubaker Siddique

Abstract:

Simulation and modelling of Unmanned Aerial Vehicles (UAVs) have gained wide popularity in the aerospace community. The demand for designing and modelling optimized control systems for UAVs has increased tenfold over the last decade, because next-generation warfare depends on unmanned technologies. Therefore, this research focuses on the simulation of nonlinear UAV dynamics in Simulink and its integration with FlightGear. There has been a lot of research on implementing optimized control using Simulink; however, there are few known techniques for simulating these dynamics over FlightGear, and the tedious task of acquiring data back from it is tackled in this research. Sending data to FlightGear is easy, but receiving it in Simulink is not as straightforward, i.e., we can only receive control data on the output. However, in this research we have managed to get the data out of FlightGear by implementing a Level-2 S-Function block within Simulink. Moreover, the results captured from FlightGear over a User Datagram Protocol (UDP) communication are compared with the attitude signals that were sent previously. This provides useful information regarding the difference between the outputs obtained from Simulink and FlightGear. It was found that the values received in Simulink were in high agreement with the FlightGear output. The complete study was conducted in discrete time.
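
The exchange itself requires Simulink and FlightGear; the underlying UDP pattern (send an attitude packet, wait for one back) can nevertheless be sketched in a few lines of Python, with the ports and packing format as assumptions:

```python
import socket, struct

TX_ADDR, RX_PORT = ("127.0.0.1", 5502), 5503   # hypothetical ports

rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", RX_PORT))
rx.settimeout(0.5)

tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(struct.pack("!3d", 0.1, -0.05, 1.57), TX_ADDR)  # roll, pitch, yaw [rad]

try:
    data, _ = rx.recvfrom(1024)
    roll, pitch, yaw = struct.unpack("!3d", data)          # echoed attitude
    print(roll, pitch, yaw)
except socket.timeout:
    print("no packet received (no simulator is running in this sketch)")
```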

Keywords: aerospace, flight control, FlightGear, communication, Simulink

Procedia PDF Downloads 286
24221 Open Source, Open Hardware Ground Truth for Visual Odometry and Simultaneous Localization and Mapping Applications

Authors: Janusz Bedkowski, Grzegorz Kisala, Michal Wlasiuk, Piotr Pokorski

Abstract:

Ground-truth data is essential for the quantitative evaluation of VO (Visual Odometry) and SLAM (Simultaneous Localization and Mapping) using, e.g., ATE (Absolute Trajectory Error) and RPE (Relative Pose Error). Many open-access data sets provide raw and ground-truth data for benchmark purposes. The issue appears when one would like to validate Visual Odometry and/or SLAM approaches on data captured using the device for which the algorithm is targeted (for example, a mobile phone) and disseminate the data to other researchers. For this reason, we propose an open source, open hardware ground-truth system that provides an accurate and precise trajectory with a 3D point cloud. It is based on a LiDAR Livox Mid-360 with a non-repetitive scanning pattern, an on-board Raspberry Pi 4B computer, a battery and software for off-line calculations (camera-to-LiDAR calibration, LiDAR odometry, SLAM, georeferencing). We show how this system can be used for the evaluation of various state-of-the-art algorithms (Stella SLAM, ORB SLAM3, DSO) in typical indoor monocular VO/SLAM scenarios.
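
The two metrics named above can be sketched as follows, assuming the estimated trajectory has already been aligned to ground truth; only the translational parts are computed here, whereas full RPE is defined on SE(3) relative poses:

```python
import numpy as np

gt = np.cumsum(np.random.default_rng(0).normal(size=(100, 3)), axis=0)  # ground truth
est = gt + np.random.default_rng(1).normal(scale=0.05, size=gt.shape)   # estimate

ate_rmse = np.sqrt(np.mean(np.sum((est - gt) ** 2, axis=1)))   # ATE as RMSE
d = 1                                                          # frame gap
rpe = np.linalg.norm((est[d:] - est[:-d]) - (gt[d:] - gt[:-d]), axis=1)
print(f"ATE={ate_rmse:.3f} m, mean RPE={rpe.mean():.3f} m")
```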

Keywords: SLAM, ground truth, navigation, LiDAR, visual odometry, mapping

Procedia PDF Downloads 69
24220 Prediction of Gully Erosion with Stochastic Modeling by using Geographic Information System and Remote Sensing Data in North of Iran

Authors: Reza Zakerinejad

Abstract:

Gully erosion is a serious problem threatening the sustainability of agricultural areas, rangeland and water resources in a large part of Iran. This type of water erosion is the main source of sedimentation in many catchment areas in the north of Iran. Since many national assessment approaches have applied only qualitative models, the aim of this study is to predict the spatial distribution of gully erosion processes by means of detailed terrain analysis and GIS-based logistic regression in the loess deposits of a case study in Golestan Province. In this study, a DEM with 25 m resolution derived from ASTER data was used. Landsat ETM data were used for land-use mapping. The TreeNet model, a stochastic modeling approach, was applied to predict the areas susceptible to gully erosion. In this model, 20% of the data were used for learning and 20% for testing, with performance evaluated using the ROC. GIS and satellite image analysis techniques were applied to derive the input information for the stochastic model. The results of this study yielded a highly accurate map of gully erosion potential.
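
The GIS-based logistic-regression step can be sketched as below; the terrain attributes are illustrative stand-ins for the raster layers derived from the DEM and Landsat data, not the study's actual predictors:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))   # e.g. slope, curvature, flow accumulation, land use
y = (X[:, 0] + 0.7 * X[:, 2] + rng.normal(scale=0.8, size=1000) > 0.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
prob = model.predict_proba(X_te)[:, 1]       # per-cell gully susceptibility
print("ROC AUC:", round(roc_auc_score(y_te, prob), 3))
```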

Keywords: TreeNet model, terrain analysis, Golestan Province, Iran

Procedia PDF Downloads 535
24219 Data Science/Artificial Intelligence: A Possible Panacea for Refugee Crisis

Authors: Avi Shrivastava

Abstract:

In 2021, two heart-wrenching scenes, shown live on television screens across countries, painted a grim picture of refugees. One was of people clinging onto an airplane's wings in their desperate attempt to flee war-torn Afghanistan; they ultimately fell to their deaths. The other was of U.S. government authorities separating children from their parents or guardians to deter migrants and refugees from coming to the U.S. These events show the desperation refugees feel when they are trying to leave their homes in disaster zones. The data, moreover, paints a grave picture of the current refugee situation and indicates that a bleak future lies ahead for refugees across the globe. Data and information are the two threads that intertwine to weave the fabric of modern society. They are often used interchangeably, but they differ considerably: information analysis reveals rationale and logic, while data analysis reveals a pattern. Patterns revealed by data can enable us to create the tools needed to combat huge problems. Data analysis paints a clear picture so that the decision-making process becomes simpler. Geopolitical and economic data can be used to predict future refugee hotspots, and accurately predicting the next refugee hotspots will allow governments and relief agencies to prepare better for future refugee crises. The refugee crisis does not have binary answers. Given the emotionally wrenching nature of the ground realities, experts often shy away from realistically stating things as they are; this hesitancy can cost lives. When decisions are based on data, emotions can be removed from the decision-making process. Data also presents irrefutable evidence, tells whether a solution exists or not, and can give a nonbinary crisis a clear answer, all of which makes the problem easier to tackle. Data science and A.I. can predict future refugee crises: with the recent explosion of data due to the rise of social media platforms, data and the insights drawn from it have helped solve many social and political problems. Data science can also help solve many issues refugees face while staying in refugee camps or in adopted countries. This paper looks into the various ways data science can help solve refugee problems. A.I.-based chatbots can help refugees seek legal help to find asylum in the country they want to settle in, and can help them find marketplaces where people willing to help can offer it. Data science and technology can also help address many of refugees' problems, including food, shelter, employment, security, and assimilation. The refugee problem is among the most challenging for social and political reasons, but data science and machine learning can help prevent refugee crises and solve or alleviate some of the problems refugees face on their journey to a better life.

Keywords: refugee crisis, artificial intelligence, data science, refugee camps, Afghanistan, Ukraine

Procedia PDF Downloads 72
24218 A Spatial Point Pattern Analysis to Recognize Fail Bit Patterns in Semiconductor Manufacturing

Authors: Youngji Yoo, Seung Hwan Park, Daewoong An, Sung-Shick Kim, Jun-Geol Baek

Abstract:

The yield management system is very important for producing high-quality semiconductor chips in the semiconductor manufacturing process. In order to improve the quality of semiconductors, various tests are conducted in the post-fabrication (FAB) process. During the test process, large amounts of data are collected, and the data include a lot of information about defects. In general, defects on the wafer are the main cause of yield loss. Therefore, analyzing the defect data is necessary to improve the performance of yield prediction. The wafer bin map (WBM) is one of the data sources collected in the test process and includes defect information such as fail bit patterns. Fail bits have the characteristics of spatial point patterns. Therefore, this paper proposes a feature extraction method using spatial point pattern analysis. Actual data obtained from the semiconductor process are used for experiments, and the experimental results show that the proposed method recognizes fail bit patterns more accurately.

Keywords: semiconductor, wafer bin map, feature extraction, spatial point patterns, contour map

Procedia PDF Downloads 384
24217 The Measurement of the Multi-Period Efficiency of the Turkish Health Care Sector

Authors: Erhan Berk

Abstract:

The purpose of this study is to examine the efficiency and productivity of the health care sector in Turkey based on four years of health care cross-sectional data. Efficiency measures are calculated by a nonparametric approach known as Data Envelopment Analysis (DEA). Productivity is measured by the Malmquist index. The research shows how the DEA-based Malmquist productivity index can be used to appraise the technology and productivity changes in Turkish hospitals located all across the country.
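
For reference, the DEA-based Malmquist index between periods t and t+1 is conventionally defined via distance functions D estimated by DEA; this is the standard definition, with the paper's exact notation assumed:

```latex
% Malmquist productivity index between periods t and t+1, with inputs x,
% outputs y, and DEA-estimated distance functions D^t, D^{t+1}.
M\bigl(x^{t+1},y^{t+1},x^{t},y^{t}\bigr)
=\left[
\frac{D^{t}\bigl(x^{t+1},y^{t+1}\bigr)}{D^{t}\bigl(x^{t},y^{t}\bigr)}
\cdot
\frac{D^{t+1}\bigl(x^{t+1},y^{t+1}\bigr)}{D^{t+1}\bigl(x^{t},y^{t}\bigr)}
\right]^{1/2}
```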

Keywords: data envelopment analysis, efficiency, health care, Malmquist Index

Procedia PDF Downloads 335
24216 Comparison Of Data Mining Models To Predict Future Bridge Conditions

Authors: Pablo Martinez, Emad Mohamed, Osama Mohsen, Yasser Mohamed

Abstract:

Highway and bridge agencies, such as the Ministry of Transportation in Ontario, use the Bridge Condition Index (BCI), defined as the weighted condition of all bridge elements, to determine the rehabilitation priorities for their bridges. Therefore, accurate forecasting of the BCI is essential for bridge rehabilitation budget planning. The large amount of bridge condition data accumulated over several years makes traditional mathematical models infeasible as analysis methods. This research study focuses on investigating different classification models developed to predict the bridge condition index in the province of Ontario, Canada, based on publicly available data for 2800 bridges over a period of more than 10 years. Data preparation is a key factor in developing acceptable classification models, even with the simplest one, the k-NN model. All the models were tested, compared and statistically validated via cross-validation and t-tests. A simple k-NN model showed reasonable results (within 0.5% relative error) when predicting the bridge condition in an upcoming year.
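
A minimal sketch of the simple k-NN predictor with cross-validation follows; the feature columns and the synthetic target are placeholders, not the Ontario database schema:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(2800, 5))        # e.g. age, traffic, spans, last BCI, region
y = 70 + 5 * X[:, 3] - 2 * X[:, 0] + rng.normal(scale=0.3, size=2800)  # next BCI

knn = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))
scores = cross_val_score(knn, X, y, cv=5,
                         scoring="neg_mean_absolute_percentage_error")
print(f"mean relative error: {-scores.mean():.2%}")  # the study reports ~0.5%
```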

Keywords: asset management, bridge condition index, data mining, forecasting, infrastructure, knowledge discovery in databases, maintenance, predictive models

Procedia PDF Downloads 191
24215 Piql Preservation Services - A Holistic Approach to Digital Long-Term Preservation

Authors: Alexander Rych

Abstract:

Piql Preservation Services ("Piql") is a turnkey solution designed for secure, migration-free long-term preservation of digital data. Piql sets an open standard for long-term preservation for the future. It consists of the equipment and processes needed for writing and retrieving digital data. Exponentially growing amounts of data demand logistically effective and cost-effective processes. Digital storage media (hard disks, magnetic tape) exhibit limited lifetimes. Repetitive data migration to overcome the rapid obsolescence of hardware and software bears an accelerated risk of data loss, data corruption or even manipulation, and adds significant repetitive costs for hardware and software investments. Piql stores any kind of data, in both its digital and analog forms, securely for 500 years. The medium that provides this is a film reel. Using photosensitive film on a polyester base, a very stable material known for its immutability over hundreds of years, secure and cost-effective long-term preservation can be provided. The film reel itself is stored in packaging capable of protecting the optical storage medium. These components have undergone extensive testing to ensure longevity of up to 500 years. In addition to its durability, film is a true WORM (write once, read many) medium. It is therefore resistant to editing or manipulation. Being able to store any form of data on the film makes Piql a superior solution for long-term preservation. Paper documents, images, video or audio sequences: all of these file formats and documents can be preserved in their native file structure. In order to restore the encoded digital data, only a film scanner, a digital camera or any appropriate optical reading device will be needed in the future. Every film reel includes an index section describing the data saved on the film. It also contains a content section carrying metadata, enabling users in the future to rebuild software in order to read and decode the digital information.

Keywords: digital data, long-term preservation, migration-free, photosensitive film

Procedia PDF Downloads 392
24214 Statistical Correlation between Logging-While-Drilling Measurements and Wireline Caliper Logs

Authors: Rima T. Alfaraj, Murtadha J. Al Tammar, Khaqan Khan, Khalid M. Alruwaili

Abstract:

OBJECTIVE/SCOPE: Caliper logging data provide critical information about wellbore shape and deformations, such as stress-induced borehole breakouts or washouts. Multiarm mechanical caliper logs are often run on wireline, which can be time-consuming, costly, and/or challenging in certain formations. To minimize rig time and improve operational safety, it is valuable to develop analytical solutions that can estimate caliper logs using available Logging-While-Drilling (LWD) data without the need to run wireline caliper logs. As a first step, the objective of this paper is to perform a statistical analysis using an extensive dataset to identify important physical parameters that should be considered in developing such analytical solutions. METHODS, PROCEDURES, PROCESS: Caliper logs and LWD data from eleven wells, with a total of more than 80,000 data points, were obtained and imported into data analytics software for analysis. Several parameters were selected to test their relationship with the measured maximum and minimum caliper logs. These parameters include gamma ray, porosity, shear and compressional sonic velocities, bulk densities, and azimuthal density. The data from the eleven wells were first visualized and cleaned. Using the analytics software, several analyses were then performed, including the computation of Pearson's correlation coefficients to show the statistical relationship between the selected parameters and the caliper logs. RESULTS, OBSERVATIONS, CONCLUSIONS: The results of this statistical analysis showed that some parameters correlate well with the caliper log data. For instance, the bulk density and azimuthal directional densities showed Pearson's correlation coefficients in the range of 0.39 to 0.57, which were relatively high compared to the correlation coefficients of the caliper data with other parameters. Other parameters, such as porosity, exhibited extremely low correlation coefficients with the caliper data. Various crossplots and visualizations of the data are also presented to provide further insights from the field data. NOVEL/ADDITIVE INFORMATION: This study offers a unique and novel look into the relative importance of, and correlation between, different LWD measurements and wireline caliper logs via an extensive dataset. The results pave the way for a more informed development of new analytical solutions for estimating the size and shape of the wellbore in real time while drilling using LWD data.
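
The correlation-screening step can be sketched with pandas as follows; the column names and the synthetic relationship are illustrative placeholders for the LWD curves and caliper readings:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "bulk_density": rng.normal(2.4, 0.1, n),
    "gamma_ray": rng.normal(60, 15, n),
    "porosity": rng.normal(0.15, 0.05, n),
})
df["caliper_max"] = 8.5 + 1.2 * (2.5 - df["bulk_density"]) + rng.normal(0, 0.1, n)

# Pearson correlation of each LWD curve against the caliper log
print(df.corr(method="pearson")["caliper_max"].drop("caliper_max").round(2))
```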

Keywords: LWD measurements, caliper log, correlations, analysis

Procedia PDF Downloads 121
24213 Inversion of Gravity Data for Density Reconstruction

Authors: Arka Roy, Chandra Prakash Dubey

Abstract:

Inverse problems are generally used for recovering hidden information from available outside data. Here, the vertical component of the gravity field is used to calculate the underlying density structure. Ill-posedness is the main obstacle in any inverse problem. Linear regularization using the Tikhonov formulation is employed with an appropriate choice of SVD and GSVD components. For handling real data, the noise level should be low enough for a reliable solution. In our study, 2D and 3D synthetic models with rectangular grids are used for gravity field calculation and the corresponding inversion for density reconstruction. A fine grid is also considered to accommodate any irregular structure. Keeping the algebraic ambiguity factor in mind, the number of observation points should exceed the number of model parameters. A Picard plot is presented for choosing the appropriate, main controlling eigenvalues for a regularized solution. Another important tool is the depth resolution plot (DRP). DRPs are generally used to study how the inversion is influenced by regularization and discretization. Our further study involves real-time gravity data inversion of the Vredefort Dome, South Africa, where we apply our method to the field data. The resulting density structure is in good agreement with known formations in that region, which provides additional support for our method.
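
The Tikhonov-regularized solution via the SVD can be sketched for a linear problem d = Gm as follows; the forward operator here is a synthetic stand-in for the gravity kernel, and the regularization parameter is an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(80, 120))               # forward operator (observations x cells)
m_true = np.zeros(120); m_true[50:60] = 1.0  # density anomaly
d = G @ m_true + rng.normal(scale=0.01, size=80)

U, s, Vt = np.linalg.svd(G, full_matrices=False)
# Picard check: compare |U.T @ d| against the singular values s
lam = 0.1                                    # regularization parameter
filt = s / (s**2 + lam**2)                   # Tikhonov filter factors
m_rec = Vt.T @ (filt * (U.T @ d))            # regularized solution
print("recovery error:", round(np.linalg.norm(m_rec - m_true), 3))
```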

Keywords: depth resolution plot, gravity inversion, Picard plot, SVD, Tikhonov formulation

Procedia PDF Downloads 212
24212 DeepOmics: Deep Learning for Understanding Genome Functioning and the Underlying Genetic Causes of Disease

Authors: Vishnu Pratap Singh Kirar, Madhuri Saxena

Abstract:

Advancements in sequence data generation technologies are churning out voluminous omics data and posing a massive challenge for annotating biological functional features. With so much data available, the use of machine learning methods and tools to make novel inferences has become obvious. Machine learning methods have been successfully applied to many disciplines, including computational biology and bioinformatics. Researchers in computational biology are interested in developing novel machine learning frameworks to classify the huge amounts of biological data. In this proposal, we plan to employ novel machine learning approaches to aid the understanding of how apparently innocuous mutations (in intergenic DNA and at synonymous sites) cause diseases. We are also interested in discovering novel functional sites in the genome, mutations in which can affect a phenotype of interest.

Keywords: genome wide association studies (GWAS), next generation sequencing (NGS), deep learning, omics

Procedia PDF Downloads 97
24211 An Efficient Data Mining Technique for Online Stores

Authors: Mohammed Al-Shalabi, Alaa Obeidat

Abstract:

In any food store, some items expire or are destroyed because demand for them is infrequent, so we need a system that can help the decision maker make an offer on such items: improving demand by bundling them with some other frequent item and decreasing the price to avoid losses. The system generates hundreds or thousands of patterns (offers) for each low-demand item, then uses association rules (support, confidence) to find the interesting patterns (the best offer to achieve the lowest losses). In this paper, we propose a data mining method for determining the best offer by merging data mining techniques with an e-commerce strategy. The task is to build a model to predict the best offer. The goal is to maximize the profits of a store and avoid the loss of products. The idea in this paper is to use association rules in marketing in combination with e-commerce.
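
Scoring candidate offers by support and confidence can be sketched in a few lines; the baskets and items below are invented for illustration:

```python
baskets = [{"bread", "milk"}, {"bread", "milk", "saffron"},
           {"milk"}, {"bread", "saffron"}, {"bread", "milk"}]

def support(itemset):
    # fraction of baskets that contain every item in the itemset
    return sum(itemset <= b for b in baskets) / len(baskets)

def confidence(antecedent, consequent):
    # P(consequent | antecedent) estimated from the baskets
    return support(antecedent | consequent) / support(antecedent)

slow_item = {"saffron"}                       # low-demand item to bundle
for anchor in ({"bread"}, {"milk"}, {"bread", "milk"}):
    print(anchor, "->", slow_item,
          f"support={support(anchor | slow_item):.2f}",
          f"confidence={confidence(anchor, slow_item):.2f}")
```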

Keywords: data mining, association rules, confidence, online stores

Procedia PDF Downloads 410
24210 Elemental Graph Data Model: A Semantic and Topological Representation of Building Elements

Authors: Yasmeen A. S. Essawy, Khaled Nassar

Abstract:

With the rapid increase of complexity in the building industry, professionals in the A/E/C industry have been forced to adopt Building Information Modeling (BIM) in order to enhance communication between the different project stakeholders throughout the project life cycle and to create a semantic, object-oriented building model that can support geometric-topological analysis of building elements during design and construction. This paper presents a model that extracts topological relationships and geometrical properties of building elements from an existing fully designed BIM and maps this information into a directed acyclic Elemental Graph Data Model (EGDM). The model incorporates BIM-based search algorithms for the automatic deduction of geometrical data and topological relationships for each building element type. Using graph search algorithms, such as Depth First Search (DFS) and topological sorting, all possible construction sequences can be generated and compared against production and construction rules to generate an optimized construction sequence and its associated schedule. The model is implemented on a C# platform.
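
The paper implements its model in C#; the core sequencing idea, a topological sort of the element graph, can nevertheless be sketched in Python, with the elements and dependencies invented for illustration:

```python
from graphlib import TopologicalSorter

# edges: element -> set of elements that must be built first
egdm = {
    "footing": set(),
    "column": {"footing"},
    "beam": {"column"},
    "slab": {"beam", "column"},
    "wall": {"slab"},
}
order = list(TopologicalSorter(egdm).static_order())
print(order)   # one feasible construction sequence, starting from "footing"
```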

Keywords: building information modeling (BIM), elemental graph data model (EGDM), geometric and topological data models, graph theory

Procedia PDF Downloads 382
24209 Shaped Crystal Growth of Fe-Ga and Fe-Al Alloy Plates by the Micro Pulling down Method

Authors: Kei Kamada, Rikito Murakami, Masahiko Ito, Mototaka Arakawa, Yasuhiro Shoji, Toshiyuki Ueno, Masao Yoshino, Akihiro Yamaji, Shunsuke Kurosawa, Yuui Yokota, Yuji Ohashi, Akira Yoshikawa

Abstract:

Energy harvesting techniques have been widely developed in recent years due to the high demand for power supplies for 'Internet of Things' devices such as wireless sensor nodes. In these applications, techniques for converting mechanical vibration energy into electrical energy using magnetostrictive materials have drawn attention. Among magnetostrictive materials, Fe-Ga and Fe-Al alloys are attractive due to figures of merit such as price, mechanical strength, and high magnetostriction constant. Up to now, bulk crystals of these alloys have been produced by the Bridgman-Stockbarger method or the Czochralski method. Using these methods, big bulk crystals up to 2-3 inches in diameter can be grown. However, non-uniformity of chemical composition along the crystal growth direction cannot be avoided, which results in non-uniformity of the magnetostriction constant and a reduction of the production yield. The micro-pulling-down (μ-PD) method has been developed as a shaped crystal growth technique. Our group has reported shaped crystal growth of oxide and fluoride single crystals with different shapes such as rods, plates, tubes and thin fibers. The advantages of this method are low segregation due to the high growth rate and small diffusion of melt at the solid-liquid interface, and small kerf loss due to the near-net-shape crystal. In this presentation, we report the shaped growth of long plate crystals of Fe-Ga and Fe-Al alloys using the μ-PD method. Alloy crystals were grown by the μ-PD method using a calcium oxide crucible and an induction heating system under a nitrogen atmosphere. The bottom hole of the crucibles was 5 x 1 mm² in size. A <100>-oriented iron-based alloy was used as a seed crystal. Alloy crystal plates of 5 x 1 x 320 mm³ were successfully grown. The results of crystal growth, chemical composition analysis, magnetostrictive properties and a prototype vibration energy harvester are reported. Furthermore, continuous crystal growth using a powder supply system will be reported as a way to minimize chemical composition non-uniformity along the growth direction.

Keywords: crystal growth, micro-pulling-down method, Fe-Ga, Fe-Al

Procedia PDF Downloads 334
24208 Wireless Sensor Network for Forest Fire Detection and Localization

Authors: Tarek Dandashi

Abstract:

WSNs may provide a fast and reliable solution for the early detection of environmental events like forest fires, which is crucial for alerting and calling for fire brigade intervention. Sensor nodes communicate sensor data to a host station, which enables a global analysis and the generation of a reliable decision on a potential fire and its location. A WSN based on TinyOS and nesC is presented for capturing and transmitting a variety of sensor information with controlled sources, data rates and durations, and for recording/displaying activity traces. We propose a similarity distance (SD) between the distribution of currently sensed data and that of a reference. At any given time, a fire causes diverging readings in the reported data, which alters the usual data distribution. Basically, SD consists of a metric on the Cumulative Distribution Function (CDF). SD is designed to be invariant to day-to-day changes of temperature, changes due to the surrounding environment, and normal changes in weather, which preserve the data locality. Evaluation shows that SD sensitivity is quadratic with respect to an increase in sensor node temperature for groups of sensors of different sizes and neighborhoods. Simulation of fire spreading, with ignition placed at random locations and some wind speed, shows that SD takes a few minutes to reliably detect fires and locate them. We also discuss the case of false negatives and false positives and their impact on decision reliability.
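
A KS-style version of the CDF metric can be sketched as follows; the exact SD definition in the paper may differ, and the temperature data here are invented:

```python
import numpy as np

def similarity_distance(current, reference, grid=None):
    # sup-norm distance between the two empirical CDFs on a common grid
    if grid is None:
        grid = np.linspace(min(current.min(), reference.min()),
                           max(current.max(), reference.max()), 256)
    cdf_cur = np.searchsorted(np.sort(current), grid, side="right") / len(current)
    cdf_ref = np.searchsorted(np.sort(reference), grid, side="right") / len(reference)
    return np.max(np.abs(cdf_cur - cdf_ref))

rng = np.random.default_rng(0)
normal_day = rng.normal(25, 2, 500)            # reference temperatures [C]
fire_onset = rng.normal(31, 4, 500)            # diverging readings near a fire
print(similarity_distance(normal_day[:250], normal_day))  # small -> no alarm
print(similarity_distance(fire_onset, normal_day))        # large -> alarm
```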

Keywords: forest fire, WSN, wireless sensor network, algorithm

Procedia PDF Downloads 262
24207 A Feasibility Study of Crowdsourcing Data Collection for Facility Maintenance Management

Authors: Mohamed Bin Alhaj, Hexu Liu, Mohammed Sulaiman, Osama Abudayyeh

Abstract:

An effective facility maintenance management (FMM) system plays a crucial role in improving the quality of services and maintaining the facility in good condition. Current FMM heavily relies on the quality of the data collection function of the FMM systems, at times resulting in inefficient FMM decision-making. The new technology-based crowdsourcing provides great potential to improve the current FMM practices, especially in terms of timeliness and quality of data. This research aims to investigate the feasibility of using new technology-driven crowdsourcing for FMM and highlight its opportunities and challenges. A survey was carried out to understand the human, data, system, geospatial, and automation characteristics of crowdsourcing for an educational campus FMM via social networks. The survey results were analyzed to reveal the challenges and recommendations for the implementation of crowdsourcing for FMM. This research contributes to the body of knowledge by synthesizing the challenges and opportunities of using crowdsourcing for facility maintenance and providing a road map for applying crowdsourcing technology in FMM. In future work, a conceptual framework will be proposed to support data-driven FMM using social networks.

Keywords: crowdsourcing, facility maintenance management, social networks

Procedia PDF Downloads 173
24206 Challenges and Opportunities: One Stop Processing for the Automation of Indonesian Large-Scale Topographic Base Map Using Airborne LiDAR Data

Authors: Elyta Widyaningrum

Abstract:

LiDAR data acquisition has been recognized as one of the fastest solutions for providing the base data for topographic base mapping in Indonesia. The challenge of accelerating the provision of large-scale topographic base maps as a basis for development planning creates an opportunity to implement an automated scheme in the map production process. One-stop processing will also contribute to accelerating map provision, especially in conforming to the Indonesian fundamental spatial data catalog derived from ISO 19110 and geospatial database integration. Thus, automated LiDAR classification, DTM generation and feature extraction will be conducted in one GIS-software environment to form all layers of the topographic base maps. The quality of the automated topographic base map will be assessed and analyzed based on its completeness, correctness, contiguity, consistency and possible customization.

Keywords: automation, GIS environment, LiDAR processing, map quality

Procedia PDF Downloads 368