Search results for: atomic data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25860

24450 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 115
24449 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform

Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu

Abstract:

Given a fixed fund, building a Hadoop big data platform forces a trade-off between purchasing fewer hosts of higher capability and more hosts of lower capability. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing consists of SQL queries with aggregate, join, and space-time condition selections executed on massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of candidate host clusters for the intended typical big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem, HDFS+Hive+Spark. A metric was proposed to measure the performance of Hadoop clusters in HBDP, which was tested and compared with its predicted counterpart on three kinds of typical SQL query tasks. Tests were conducted with respect to CPU benchmark, memory size, virtual host division, and the number of physical hosts in the cluster. The research has been applied to practical cluster procurement for housing big data computing.

Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance predicting formula, typical SQL query tasks

Procedia PDF Downloads 232
24448 Model Predictive Controller for Pasteurization Process

Authors: Tesfaye Alamirew Dessie

Abstract:

Our study focuses on developing a Model Predictive Controller (MPC) and evaluating it against a traditional PID for a pasteurization process. Utilizing system identification from the experimental data, the dynamics of the pasteurization process were calculated. Using best fit with data validation, residual, and stability analysis, the quality of several model architectures was evaluated. The validation data fit the auto-regressive with exogenous input (ARX322) model of the pasteurization process by roughly 80.37 percent. The ARX322 model structure was used to create MPC and PID control techniques. After comparing controller performance based on settling time, overshoot percentage, and stability analysis, it was found that MPC controllers outperform PID for those parameters.

Keywords: MPC, PID, ARX, pasteurization

Procedia PDF Downloads 165
24447 Peculiarities of Absorption near the Edge of the Fundamental Band of Irradiated InAs-InP Solid Solutions

Authors: Nodar Kekelidze, David Kekelidze, Elza Khutsishvili, Bela Kvirkvelia

Abstract:

Semiconductor devices are irreplaceable elements for investigations in space (artificial Earth satellites, interplanetary spacecraft, probes, rockets), for the investigation of elementary particles in accelerators, for atomic power stations and nuclear reactors, and for robots operating in heavily radiation-contaminated territories (Chernobyl, Fukushima). Unfortunately, the most important parameters of semiconductors worsen dramatically under irradiation. The creation of radiation-resistant semiconductor materials for opto- and microelectronic devices is therefore a pressing problem, as is the investigation of the complicated processes that develop in irradiated solids. Homogeneous single crystals of InP-InAs solid solutions were grown by the zone melting method. The dependence of the optical absorption coefficient on photon energy near the fundamental absorption edge was studied. This dependence changes dramatically with irradiation. The experiments were performed on InP, InAs and InP-InAs solid solutions before and after irradiation with electrons and fast neutrons. The optical properties were investigated with an infrared spectrophotometer in the temperature range 10 K-300 K and the spectral range 1 µm-50 µm. The fluence of fast neutrons was 2·10¹⁸ neutron/cm², and electrons with energies of 3 MeV and 50 MeV were used up to fluences of 6·10¹⁷ electron/cm². Under irradiation, an exponential dependence of the optical absorption coefficient on photon energy was revealed in the region of energy deficiency. This phenomenon takes place at high and low temperatures, at different impurity concentrations, and in practically all cases of irradiation by electrons of various energies and by fast neutrons. We have developed a common mechanism of this phenomenon for unirradiated materials and carried out quantitative calculations of the characteristic parameter; these are in satisfactory agreement with experimental data. For the irradiated crystals, the picture becomes more complicated.
In this work, the corresponding analysis is carried out. It is shown that in the case of InP irradiated with electrons (Ф = 1·10¹⁷ el/cm²), the optical absorption curve is shifted to lower energies. This is caused by the appearance of tails of the density of states in the forbidden band due to local fluctuations of the ionized impurity (defect) concentration. The situation is more complicated for InAs and for solid solutions with compositions close to InAs, where, besides the phenomenon noted above, the Burstein effect takes place, caused by the increase in electron concentration as a result of irradiation. We have shown that under certain conditions the Burstein effect can prevail. This causes the opposite effect: a shift of the optical absorption edge to higher energies. Thus, two oppositely directed processes take place in these solid solutions. By selecting the solid solution composition and the doping impurity, we obtained an InP-InAs solid solution in which the displacements of the optical absorption curve compensate each other under irradiation. This result makes it possible to create radiation-resistant optical materials based on InP-InAs solid solutions. Conclusion: the nature of optical absorption near the fundamental edge in semiconductor materials was established, and a radiation-resistant optical material was created.

Keywords: InAs-InP, electrons concentration, irradiation, solid solutions

Procedia PDF Downloads 202
24446 Point Estimation for the Type II Generalized Logistic Distribution Based on Progressively Censored Data

Authors: Rana Rimawi, Ayman Baklizi

Abstract:

Skewed distributions are important models that are frequently used in applications. Generalized distributions form a class of skewed distributions and have gained widespread use in applications because of their flexibility in data analysis. More specifically, the Generalized Logistic Distribution with its different types has received considerable attention recently. In this study, based on progressively type-II censored data, we will consider point estimation in the type II Generalized Logistic Distribution (Type II GLD). We will develop several estimators for its unknown parameters, including maximum likelihood estimators (MLE), Bayes estimators and best linear unbiased estimators (BLUE). The estimators will be compared using simulation based on the criteria of bias and mean square error (MSE). An illustrative example of a real data set will be given.

Keywords: point estimation, type II generalized logistic distribution, progressive censoring, maximum likelihood estimation

Procedia PDF Downloads 201
24445 The Association between Gene Polymorphisms of GPX, SEPP1, and SEP15, Plasma Selenium Levels, Urinary Total Arsenic Concentrations, and Prostate Cancer

Authors: Yu-Mei Hsueh, Wei-Jen Chen, Yung-Kai Huang, Cheng-Shiuan Tsai, Kuo-Cheng Yeh

Abstract:

Prostate cancer occurs in men over the age of 50 and ranks sixth among the top ten cancers in Taiwan, where its incidence has increased gradually over the past decade. Arsenic is confirmed as a carcinogen by the International Agency for Research on Cancer (IARC). Arsenic-induced oxidative stress may be a risk factor for prostate cancer, but the mechanism is not clear. Selenium is an important antioxidant element. Whether the association between plasma selenium levels and the risk of prostate cancer is modified by different selenoprotein genotypes is still unknown. Glutathione peroxidase (GPX), selenoprotein P (SEPP1) and 15 kDa selenoprotein (SEP15) are selenoproteins that regulate selenium transport and oxidation-reduction reactions. However, the association between gene polymorphisms of selenoproteins and prostate cancer is not yet clear. The aim of this study is to determine the relationship between plasma selenium, polymorphisms of selenoproteins, urinary total arsenic concentration and prostate cancer. This study is a hospital-based case-control study. Three hundred twenty-two prostate cancer cases and 322 age-matched (±5 years, 1:1) controls were recruited from National Taiwan University Hospital, Taipei Medical University Hospital, and Wan Fang Hospital. Well-trained personnel carried out standardized personal interviews based on a structured questionnaire. Information collected included demographic and socioeconomic characteristics, lifestyle and disease history. Blood and urine samples were also collected at the same time. The Research Ethics Committee of National Taiwan University Hospital, Taipei, Taiwan, approved the study. All patients provided informed consent forms before sample and data collection. The buffy coat was used to extract DNA, and polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) was used to determine the genotypes of SEPP1 rs3797310, SEP15 rs5859, GPX1 rs1050450, GPX2 rs4902346, GPX3 rs4958872, and GPX4 rs2075710.
Plasma concentrations of selenium were determined by inductively coupled plasma mass spectrometry (ICP-MS). Urinary arsenic species concentrations were measured by high-performance liquid chromatography linked with a hydride generator and atomic absorption spectrometer (HPLC-HG-AAS). Subjects with a high educational level had a lower prostate cancer odds ratio (OR) than those with a low educational level. Mainland Chinese and aboriginal people had a lower OR of prostate cancer compared to Fukien Taiwanese. After adjustment for age and educational level, subjects with the GPX1 rs1050450 CT and TT genotypes had a lower OR of prostate cancer than those with the CC genotype; the OR and 95% confidence interval (CI) was 0.53 (0.31-0.90). Subjects with the SEPP1 rs3797310 CT+TT genotype had a marginally significantly lower OR of prostate cancer than those with the CC genotype. Low plasma selenium levels and high urinary total arsenic concentrations were associated with a high OR of prostate cancer in a significant dose-response manner, and the SEPP1 rs3797310 genotype modified this joint association.
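
As a rough illustration of how an odds ratio and its 95% confidence interval are obtained in a case-control design, the following Python sketch computes a crude (unadjusted) OR with a Wald interval from a 2x2 table. It is not the authors' analysis: the abstract reports ORs adjusted for age and educational level, which would normally come from a regression model. The function name `odds_ratio_ci` and the counts in the example are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Wald confidence interval (z=1.96 gives ~95%)."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    # OR = (a * d) / (b * c) for the table [[a, b], [c, d]]
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR)
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 50, 50)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that lies entirely below 1, as in the reported 0.53 (0.31-0.90), indicates a statistically significant lower odds of disease in the compared group.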

Keywords: prostate cancer, plasma selenium concentration, urinary total arsenic concentrations, glutathione peroxidase, selenoprotein P, selenoprotein 15, gene polymorphism

Procedia PDF Downloads 268
24444 Identification of Toxic Metal Deposition in Food Cycle and Its Associated Public Health Risk

Authors: Masbubul Ishtiaque Ahmed

Abstract:

Food chain contamination by heavy metals has become a critical issue in recent years because of their potential accumulation in bio systems through contaminated water, soil and irrigation water. Industrial discharge, fertilizers, contaminated irrigation water, fossil fuels, sewage sludge and municipality wastes are the major sources of heavy metal contamination in soils and subsequent uptake by crops. The main objectives of this project were to determine the levels of minerals, trace elements and heavy metals in major foods and beverages consumed by the poor and non-poor households of Dhaka city and assess the dietary risk exposure to heavy metal and trace metal contamination and potential health implications as well as recommendations for action. Heavy metals are naturally occurring elements that have a high atomic weight and a density of at least 5 times greater than that of water. Their multiple industrial, domestic, agricultural, medical and technological applications have led to their wide distribution in the environment; raising concerns over their potential effects on human health and the environment. Their toxicity depends on several factors including the dose, route of exposure, and chemical species, as well as the age, gender, genetics, and nutritional status of exposed individuals. Because of their high degree of toxicity, arsenic, cadmium, chromium, lead, and mercury rank among the priority metals that are of public health significance. These metallic elements are considered systemic toxicants that are known to induce multiple organ damage, even at lower levels of exposure. This review provides an analysis of their environmental occurrence, production and use, potential for human exposure, and molecular mechanisms of toxicity, and carcinogenicity.

Keywords: food chain, determine the levels of minerals, trace elements, heavy metals, production and use, human exposure, toxicity, carcinogenicity

Procedia PDF Downloads 287
24443 Omni: Data Science Platform for Evaluate Performance of a LoRaWAN Network

Authors: Emanuele A. Solagna, Ricardo S. Tozetto, Roberto dos S. Rabello

Abstract:

Nowadays, physical processes are being digitized through the evolution of communication, sensing and storage technologies, which promotes the development of smart cities. The evolution of these technologies has generated multiple challenges related to the generation of big data and the active participation of electronic devices in society. Devices can send information that is captured and processed over large areas, but there is no guarantee that all of the data obtained will be effectively stored and correctly persisted, because, depending on the technology used, some parameters have a large influence on the full delivery of information. This article aims to characterize a project, currently under development, for a data science platform that will evaluate the performance and effectiveness of an industrial network implementing LoRaWAN technology, considering the configuration of its main parameters and relating these parameters to information loss.

Keywords: Internet of Things, LoRa, LoRaWAN, smart cities

Procedia PDF Downloads 149
24442 Cybervetting and Online Privacy in Job Recruitment – Perspectives on the Current and Future Legislative Framework Within the EU

Authors: Nicole Christiansen, Hanne Marie Motzfeldt

Abstract:

In recent years, more and more HR professionals have been using cyber-vetting in job recruitment in an effort to find the perfect match for the company. These practices are growing rapidly, accessing a vast amount of data from social networks, some of which is privileged and protected information. Thus, there is a risk that the right to privacy is becoming a duty to manage your private data. This paper investigates to what degree a job applicant's fundamental rights are protected adequately in current and future legislation in the EU. This paper argues that current data protection regulations and forthcoming regulations on the use of AI ensure sufficient protection. However, even though the regulation on paper protects employees within the EU, the recruitment sector may not pay sufficient attention to the regulation as it is not specifically targeting this area. Therefore, the lack of specific labor and employment regulation is a concern that the social partners should attend to.

Keywords: AI, cyber vetting, data protection, job recruitment, online privacy

Procedia PDF Downloads 88
24441 Sequential Pattern Mining from Data of Medical Record with Sequential Pattern Discovery Using Equivalent Classes (SPADE) Algorithm (A Case Study : Bolo Primary Health Care, Bima)

Authors: Rezky Rifaini, Raden Bagus Fajriya Hakim

Abstract:

This research was conducted at the Bolo Primary Health Care (PHC) in Bima Regency. The purpose of the research is to find the association patterns formed in the medical record database of Bolo PHC's patients. The data used are secondary data from the PHC medical records database. Sequential pattern mining is the method used for the analysis. Transaction data are generated from Patient_ID, Check_Date and diagnosis. Sequential Pattern Discovery Using Equivalent Classes (SPADE) is one of the algorithms in sequential pattern mining; it finds frequent sequences in transaction data using a vertical database and a sequence join process. The result of the SPADE algorithm is a set of frequent sequences, which are then used to form rules. This technique is used to find association patterns between item combinations. Based on sequential association rule analysis with the SPADE algorithm, a minimum support of 0.03 and a minimum confidence of 0.75 yield 3 sequential association patterns based on the sequence of Patient_ID, Check_Date and diagnosis data in the Bolo PHC.
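
The support and confidence thresholds mentioned above can be made concrete with a small Python sketch. It counts pattern occurrences by brute force over per-patient diagnosis sequences; it is not SPADE itself, which reaches the same counts more efficiently through vertical id-lists and sequence joins. The helper names and the diagnosis codes `D1`..`D4` are hypothetical, and the toy data are not from the Bolo PHC records.

```python
def contains(sequence, pattern):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    visits = iter(sequence)
    # `item in visits` advances the iterator, so order is enforced.
    return all(item in visits for item in pattern)


def support(sequences, pattern):
    """Fraction of patient sequences that contain the pattern."""
    return sum(contains(seq, pattern) for seq in sequences) / len(sequences)


def confidence(sequences, antecedent, consequent):
    """support(antecedent followed by consequent) / support(antecedent)."""
    base = support(sequences, antecedent)
    if base == 0:
        return 0.0
    return support(sequences, antecedent + consequent) / base


# Hypothetical data: one list of diagnoses per patient, ordered by check date.
patient_sequences = [
    ["D1", "D2", "D3"],
    ["D1", "D3"],
    ["D2", "D1"],
    ["D1", "D4"],
]

MIN_SUPPORT, MIN_CONFIDENCE = 0.03, 0.75  # thresholds quoted in the abstract
rule_support = support(patient_sequences, ("D1", "D3"))
rule_confidence = confidence(patient_sequences, ("D1",), ("D3",))
keep_rule = rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
print(rule_support, rule_confidence, keep_rule)
```

A rule is retained only when both thresholds are met, which is why a low minimum support combined with a high minimum confidence can leave just a handful of patterns.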

Keywords: diagnosis, primary health care, medical record, data mining, sequential pattern mining, SPADE algorithm

Procedia PDF Downloads 402
24440 Estimation of Reservoirs Fracture Network Properties Using an Artificial Intelligence Technique

Authors: Reda Abdel Azim, Tariq Shehab

Abstract:

The main objective of this study is to develop a subsurface fracture map of naturally fractured reservoirs by overcoming the limitations associated with different data sources in characterising fracture properties. Some of these limitations are overcome by employing a nested neuro-stochastic technique to establish the inter-relationship between different data, such as conventional well logs, borehole images (FMI), core descriptions, and seismic attributes, and then to characterise fracture properties in terms of fracture density and fractal dimension for each data source. Fracture density is an important property of a fracture network, as it is a measure of the cumulative area of all the fractures in a unit volume of the fracture network system, and fractal dimension is also used to characterize self-similar objects such as fractures. At the wellbore locations, fracture density and fractal dimension can only be estimated for limited sections where FMI data are available. Therefore, an artificial intelligence technique is applied to approximate these quantities at locations along the wellbore where hard data are not available. It should be noted that artificial intelligence techniques have proven their effectiveness in this domain of applications.

Keywords: naturally fractured reservoirs, artificial intelligence, fracture intensity, fractal dimension

Procedia PDF Downloads 256
24439 Governance, Risk Management, and Compliance Factors Influencing the Adoption of Cloud Computing in Australia

Authors: Tim Nedyalkov

Abstract:

A business decision to move to the cloud brings fundamental changes in how an organization develops and delivers its Information Technology solutions. The accelerated pace of digital transformation across businesses and government agencies increases the reliance on cloud-based services. Collecting, managing, and retaining large amounts of data in cloud environments makes information security and data privacy protection essential. It becomes even more important to understand what key factors drive successful cloud adoption following the commencement of the Privacy Amendment (Notifiable Data Breaches) Act 2017 (NDB) in Australia, as the regulatory changes impact many organizations and industries. This quantitative correlational research investigated the governance, risk management, and compliance factors contributing to cloud security success. The factors influence the adoption of cloud computing within an organizational context after the commencement of the NDB scheme. The results and findings demonstrated that corporate information security policies, data storage location, management understanding of data governance responsibilities, and regular compliance assessments are the factors influencing cloud computing adoption. The research has implications for organizations, future researchers, practitioners, policymakers, and cloud computing providers to meet the rapidly changing regulatory and compliance requirements.

Keywords: cloud compliance, cloud security, data governance, privacy protection

Procedia PDF Downloads 117
24438 Simulations to Predict Solar Energy Potential by ERA5 Application at North Africa

Authors: U. Ali Rahoma, Nabil Esawy, Fawzia Ibrahim Moursy, A. H. Hassan, Samy A. Khalil, Ashraf S. Khamees

Abstract:

The design of any solar energy conversion system requires knowledge of solar radiation data obtained over a long period. Satellite data have been widely used to estimate solar energy where no ground observation of solar radiation is available, yet there are limitations on the temporal coverage of satellite data. Reanalysis is a “retrospective analysis” of atmospheric parameters generated by assimilating observation data from various sources, including ground observations, satellites, ships, and aircraft, with the output of NWP (Numerical Weather Prediction) models, to develop an exhaustive record of weather and climate parameters. The performance of the reanalysis dataset (ERA-5) for North Africa was evaluated against high-quality surface measured data using statistical analysis. The global solar radiation (GSR) distribution was estimated over six selected locations in North Africa for the ten years from 2011 to 2020. The root mean square error (RMSE), mean bias error (MBE) and mean absolute error (MAE) of the reanalysis solar radiation data range from 0.079 to 0.222, 0.0145 to 0.198, and 0.055 to 0.178, respectively. A seasonal statistical analysis was performed to study the seasonal variation in dataset performance, which reveals significant variation of errors between seasons. The performance of the dataset also changes with the temporal resolution of the data used for comparison. The monthly mean values of data show better performance, but the accuracy of data is compromised. The solar radiation data of ERA-5 is used for preliminary solar resource assessment and power estimation. The correlation coefficient (R²) varies from 0.93 to 0.99 for the different selected sites in North Africa in the present research.
The goal of this research is to give a good representation of global solar radiation to support solar energy applications in all fields; this can be done by using gridded data from the European Centre for Medium-Range Weather Forecasts (ECMWF) and producing a new model that gives good results.
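
The three error statistics named above can be sketched in a few lines of Python. This is a minimal illustration, assuming paired lists of reanalysis estimates and ground measurements and the sign convention "estimate minus observation" for MBE; the abstract does not state its convention or units, and the function name `error_metrics` and the sample values are hypothetical.

```python
import math


def error_metrics(estimated, observed):
    """Return RMSE, MBE and MAE for paired estimates and observations."""
    if len(estimated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [e - o for e, o in zip(estimated, observed)]
    n = len(diffs)
    return {
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),  # penalizes large errors
        "mbe": sum(diffs) / n,                             # signed: over/under-estimation
        "mae": sum(abs(d) for d in diffs) / n,             # average error magnitude
    }


# Hypothetical values, NOT ERA-5 or station data.
metrics = error_metrics([1.0, 2.0, 3.0], [1.0, 1.0, 5.0])
print(metrics)
```

Because MBE keeps the sign of each error, positive and negative deviations can cancel, so a small MBE alongside a larger MAE or RMSE indicates scatter rather than systematic bias.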

Keywords: solar energy, solar radiation, ERA-5, potential energy

Procedia PDF Downloads 215
24437 Efficient Pre-Processing of Single-Cell Assay for Transposase Accessible Chromatin with High-Throughput Sequencing Data

Authors: Fan Gao, Lior Pachter

Abstract:

The primary tool currently used to pre-process 10X Chromium single-cell ATAC-seq data is Cell Ranger, which can take very long to run on standard datasets. To facilitate rapid pre-processing that enables reproducible workflows, we present a suite of tools called scATAK for pre-processing single-cell ATAC-seq data that is 15 to 18 times faster than Cell Ranger on mouse and human samples. Our tool can also calculate chromatin interaction potential matrices, and generate open chromatin signal and interaction traces for cell groups. We use the scATAK tool to explore the chromatin regulatory landscape of a healthy adult human brain and unveil cell-type specific features, and show that it provides a convenient and computationally efficient approach for pre-processing single-cell ATAC-seq data.

Keywords: single-cell, ATAC-seq, bioinformatics, open chromatin landscape, chromatin interactome

Procedia PDF Downloads 157
24436 Tripeptide Inhibitor: The Simplest Aminogenic PEGylated Drug against Amyloid Beta Peptide Fibrillation

Authors: Sutapa Som Chaudhury, Chitrangada Das Mukhopadhyay

Abstract:

Alzheimer’s disease has been a well-known form of dementia since its discovery in 1906. Current Food and Drug Administration approved medications, e.g. cholinesterase inhibitors and memantine, offer modest symptomatic relief but do not play any role in disease modification or recovery. In the last three decades, many small molecules, chaperones, synthetic peptides and partial β-secretase enzyme blockers have been tested for the development of a drug against Alzheimer’s disease, but they did not pass phase 3 clinical trials. In this study, we designed a PEGylated, aminogenic, tripeptidic polymer with two different molecular weights based on the aggregation-prone amino acid sequence 17-20 in amyloid beta (Aβ) 1-42. Being conjugated with polyethylene glycol (PEG), which self-assembles into hydrophilic nanoparticles, these PEGylated tripeptides constitute a very good drug delivery system that crosses the blood-brain barrier while the peptide remains protected from proteolytic degradation and non-specific protein interactions. Moreover, being completely aminogenic, they would not raise any side effects. These peptide inhibitors were evaluated for their effectiveness against Aβ42 fibrillation at an early stage of oligomer-to-fibril formation, as well as for clearance of preformed fibrils, via Thioflavin T (ThT) assay, dynamic light scattering analyses, atomic force microscopy and scanning electron microscopy. The inhibitors were shown to be safe at a higher concentration of 20 µM by the reduction assay of 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) dye. Moreover, SHSY5Y neuroblastoma cells showed greater survival when treated with the inhibitors following Aβ42 fibril and oligomer treatment compared with control neuroblastoma cells treated with Aβ42 fibrils and/or oligomers. These results make the peptidic inhibitors promising compounds for the discovery of alternative medication for Alzheimer’s disease.

Keywords: Alzheimer’s disease, alternative medication, amyloid beta, PEGylated peptide

Procedia PDF Downloads 209
24435 Heavy Metals and Carcinogenic Risk Assessment in Free-Ranged Livestock of Lead-Contaminated Goldmine Communities of Zamfara State, Northern Nigeria

Authors: Sulaiman Rabiu, Muazu Gusau Abubakar, Jafar Usman Zakari

Abstract:

The consumption of meat is of great importance as it provides a good source of proteins and a significant amount of essential trace elements to the body. However, contamination of meat and meat products with heavy metals is becoming a serious threat to food safety and public health. Therefore, the present study aimed to evaluate the concentrations of some heavy metals in the muscles and entrails of free-ranged cattle, sheep and goats. A total of sixty (60) fresh samples of muscle, liver, kidney, small intestine and stomach of free-ranged cattle, sheep and goats were collected from abattoirs in different goldmine communities of the Anka, Bukkuyum, Maru and Talata-Mafara Local Government Areas of Zamfara State, Nigeria. The samples were digested using 10 mL of a mixture of high-grade concentrated 70% HNO₃ and 65% HCl (4:1 v/v); the mixture was heated until dense fumes disappeared, forming a clear transparent solution, and diluted to 50 mL with deionized water. Actual concentrations of Cd, Cr, Cu, Co, As, Ni, Mn, Pb and Zn were determined using a Microwave Plasma Atomic Emission Spectrophotometer (MP-AES). From the results obtained, goat liver had the highest mean concentrations of lead, arsenic, cobalt and manganese (12.43 ± 0.31, 14.25 ± 0.32, 3.47 ± 0.86 and 12.68 ± 0.92 mg/kg, respectively), while goat kidney had the highest concentrations of copper and zinc (10.08 ± 0.61 and 24.16 ± 1.30 mg/kg, respectively). The highest concentrations of cadmium and nickel were recorded in sheep kidney (7.75 ± 0.65 and 2.08 ± 0.10 mg/kg, respectively). Cattle muscle had the highest chromium concentration of all the organs analysed. The target hazard quotients (THQs) for all the metals were below 1.0, but TR, which is a risk index for carcinogenicity, indicates an alarming result that requires stringent control to protect public health. Therefore, intensive public health awareness of the risk associated with heavy metal contamination of meat should be advocated.
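
The abstract does not give the formula behind its target hazard quotients, so the Python sketch below uses a commonly cited simplified form as an assumption: an estimated daily intake (concentration times daily consumption divided by body weight) divided by an oral reference dose, with exposure taken to occur every day of the averaging period. The function names and every input value are hypothetical placeholders, not the study's parameters or any regulatory reference dose.

```python
def estimated_daily_intake(conc_mg_per_kg, intake_kg_per_day, body_weight_kg):
    """Estimated daily intake in mg of metal per kg body weight per day."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return conc_mg_per_kg * intake_kg_per_day / body_weight_kg


def target_hazard_quotient(edi_mg_per_kg_day, reference_dose_mg_per_kg_day):
    """THQ = EDI / RfD; values below 1 suggest no appreciable non-cancer risk."""
    if reference_dose_mg_per_kg_day <= 0:
        raise ValueError("reference dose must be positive")
    return edi_mg_per_kg_day / reference_dose_mg_per_kg_day


# Hypothetical inputs for illustration only.
edi = estimated_daily_intake(conc_mg_per_kg=2.0,
                             intake_kg_per_day=0.05,
                             body_weight_kg=60.0)
thq = target_hazard_quotient(edi, reference_dose_mg_per_kg_day=0.004)
print(f"EDI = {edi:.5f} mg/kg/day, THQ = {thq:.3f}")
```

Because THQ scales linearly with both concentration and consumption, the same tissue concentration can fall above or below the threshold of 1 depending on the assumed dietary intake.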

Keywords: contamination, goldmine, heavy metals, meat

Procedia PDF Downloads 113
24434 Meta Mask Correction for Nuclei Segmentation in Histopathological Image

Authors: Jiangbo Shi, Zeyu Gao, Chen Li

Abstract:

Nuclei segmentation is a fundamental task in digital pathology analysis and can be automated by deep learning-based methods. However, the development of such an automated method requires a large amount of data with precisely annotated masks, which are hard to obtain. Training with weakly labeled data is a popular solution for reducing the workload of annotation. In this paper, we propose a novel meta-learning-based nuclei segmentation method which follows the label correction paradigm to leverage data with noisy masks. Specifically, we design a fully convolutional meta-model that can correct noisy masks by using a small amount of clean meta-data. Then the corrected masks are used to supervise the training of the segmentation model. Meanwhile, a bi-level optimization method is adopted to alternately update the parameters of the main segmentation model and the meta-model. Extensive experimental results on two nuclear segmentation datasets show that our method achieves state-of-the-art results. In particular, in some noise scenarios, it even exceeds the performance of training on supervised data.

Keywords: deep learning, histopathological image, meta-learning, nuclei segmentation, weak annotations

Procedia PDF Downloads 142
24433 Modification of Hexagonal Boron Nitride Induced by Focused Laser Beam

Authors: I. Wlasny, Z. Klusek, A. Wysmolek

Abstract:

Hexagonal boron nitride is a representative of a widely popular class of two-dimensional van der Waals materials. It finds its uses, among others, in the construction of complex layered heterostructures. Hexagonal boron nitride attracts great interest because of its properties characteristic of wide-gap semiconductors as well as its ultra-flat surface. Van der Waals heterostructures composed of two-dimensional layered materials, such as transition metal dichalcogenides or graphene, give hope for the miniaturization of various electronic and optoelectronic elements. In our presentation, we will show the results of our investigations of a not previously reported modification of hexagonal boron nitride layers with a focused laser beam. The electrostatic force microscopy (EFM) images reveal that the irradiation leads to changes of the local electric fields for a wide range of laser wavelengths (from 442 to 785 nm). These changes are also accompanied by alterations of the crystallographic structure of the material, as reflected by Raman spectra. They exhibit high stability and remain visible after at least five months. This behavior can be explained in terms of photoionization of the defect centers in h-BN, which influences the non-uniform electrostatic field screening by the photo-excited charge carriers. The analyzed changes influence the local defect structure, and thus the interatomic distances within the lattice. These effects can be amplified by the piezoelectric character of hexagonal boron nitride, similar to that found in nitrides (e.g., GaN, AlN). Our results shed new light on the optical properties of hexagonal boron nitride, in particular those associated with electron-phonon coupling. Our study also opens new possibilities for h-BN applications in layered heterostructures where electrostatic fields can be used in tailoring the local properties of the structures for use in micro- and nanoelectronics or field-controlled memory storage.
This work is supported by National Science Centre project granted on the basis of the decision number DEC-2015/16/S/ST3/00451.

Keywords: atomic force microscopy, hexagonal boron nitride, optical properties, Raman spectroscopy

Procedia PDF Downloads 175
24432 Feature Selection Approach for the Classification of Hydraulic Leakages in Hydraulic Final Inspection using Machine Learning

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Manufacturing companies are facing global competition and enormous cost pressure. The use of machine learning applications can help reduce production costs and create added value. Predictive quality enables the securing of product quality through data-supported predictions using machine learning models as a basis for decisions on test results. Furthermore, machine learning methods are able to process large amounts of data, deal with unfavourable row-column ratios, detect dependencies between the covariates and the given target, and assess the multidimensional influence of all input variables on the target. Real production data are often subject to highly fluctuating boundary conditions and unbalanced data sets. Changes in production data manifest themselves in trends, systematic shifts, and seasonal effects. Thus, machine learning applications require intensive pre-processing and feature selection. Data pre-processing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets. Within the real data set of Bosch hydraulic valves used here, time periods with comparable production conditions can be identified by applying the concept drift method. Furthermore, a classification model is developed to evaluate the feature importance in different subsets within the identified time periods. By selecting comparable and stable features, the number of features used can be significantly reduced without a strong decrease in predictive power. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. In this research, the AdaBoost classifier is used to predict the leakage of hydraulic valves based on geometric gauge blocks from machining, mating data from the assembly, and hydraulic measurement data from end-of-line testing.
In addition, the most suitable methods are selected and accurate quality predictions are achieved.
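The boosting and feature-importance idea can be illustrated with a minimal sketch. It assumes a tiny made-up data set (two hypothetical features, labels +1 for leaking and -1 for tight) and a hand-written AdaBoost with decision stumps; it is not the authors' pipeline, and all function names are illustrative.

```python
import math

def stump_predict(row, feature, threshold, polarity):
    """Decision stump: predict `polarity` if the feature is <= threshold, else the opposite."""
    return polarity if row[feature] <= threshold else -polarity

def best_stump(X, y, w):
    """Search all features/thresholds for the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, j, t, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def adaboost(X, y, rounds):
    """Fit `rounds` weighted stumps; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log(0) on perfect stumps
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        model.append((alpha, j, t, pol))
        # Re-weight samples: misclassified ones gain weight, then normalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in model)
    return 1 if score >= 0 else -1

def feature_weights(model, n_features):
    """Share of the total vote weight carried by each feature (a simple importance proxy)."""
    totals = [0.0] * n_features
    for alpha, j, _, _ in model:
        totals[j] += alpha
    s = sum(totals) or 1.0
    return [t / s for t in totals]

# Hypothetical toy data: feature 0 is informative, feature 1 is noise.
X = [[0.1, 5], [0.2, 1], [0.3, 4], [0.4, 2], [0.6, 5], [0.7, 1], [0.8, 3], [0.9, 2]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

model = adaboost(X, y, rounds=3)
importance = feature_weights(model, n_features=2)
```

Features whose importance share stays near zero across comparable time periods would be candidates for removal, which is the kind of reduction the abstract describes.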

Keywords: classification, machine learning, predictive quality, feature selection

Procedia PDF Downloads 163
24431 Zinc Oxide Varistor Performance: A 3D Network Model

Authors: Benjamin Kaufmann, Michael Hofstätter, Nadine Raidl, Peter Supancic

Abstract:

ZnO varistors are the leading overvoltage protection elements in today’s electronic industry. Their highly non-linear current-voltage characteristics, very fast response times, good reliability and attractive cost of production are unique in this field. Challenges and open questions remain, however. In particular, the drive to create even smaller, more versatile and reliable parts that fit industry’s demands brings manufacturers to the limits of their abilities. Although the varistor effect of sintered ZnO has been known since the 1960s, and a lot of work has been done in this field to explain the sudden exponential increase of conductivity, the strict dependency on sinter parameters, as well as the influence of the complex microstructure, is not sufficiently understood. For further enhancement and down-scaling of varistors, a better understanding of the microscopic processes is needed. This work attempts a microscopic approach to investigate ZnO varistor performance. In order to cope with the polycrystalline varistor ceramic and to account for all possible current paths through the material, a model of the microstructure that is as realistic as possible was set up in the form of three-dimensional networks where every grain has a constant electric potential, and voltage drop occurs only at the grain boundaries. The electro-thermal workload, depending on different grain size distributions, was investigated as well as the influence of the metal-semiconductor contact between the electrodes and the ZnO grains. A number of experimental methods are used, firstly, to feed the simulations with realistic parameters and, secondly, to verify the obtained results.
These methods are: a micro 4-point probes method system (M4PPS) to investigate the current-voltage characteristics between single ZnO grains and between ZnO grains and the metal electrode inside the varistor, micro lock-in infrared thermography (MLIRT) to detect current paths, electron backscatter diffraction and piezoresponse force microscopy to determine grain orientations, atom probe to determine atomic substituents, and Kelvin probe force microscopy for investigating grain surface potentials. The simulations showed that, within a critical voltage range, the current flow is localized along paths which represent only a tiny part of the available volume. This effect could be observed via MLIRT. Furthermore, the simulations show that the electric power density depends on the grain size distribution; it is inversely proportional to the number of active current paths, since this number determines the electrically active volume. M4PPS measurements showed that the electrode-grain contacts behave like Schottky diodes and are crucial for asymmetric current path development. Furthermore, evaluation of actual data suggests that current flow is influenced by grain orientations. The present results deepen the knowledge of the microscopic factors influencing ZnO varistor performance and can give some recommendations on fabrication for obtaining more reliable ZnO varistors.

Keywords: metal-semiconductor contact, Schottky diode, varistor, zinc oxide

Procedia PDF Downloads 283
24430 Secure Data Sharing of Electronic Health Records With Blockchain

Authors: Kenneth Harper

Abstract:

The secure sharing of Electronic Health Records (EHRs) is a critical challenge in modern healthcare, demanding solutions to enhance interoperability, privacy, and data integrity. Traditional standards like Health Information Exchange (HIE) and HL7 have made significant strides in facilitating data exchange between healthcare entities. However, these approaches rely on centralized architectures that are often vulnerable to data breaches, lack sufficient privacy measures, and have scalability issues. This paper proposes a framework for secure, decentralized sharing of EHRs using blockchain technology, cryptographic tokens, and Non-Fungible Tokens (NFTs). The blockchain's immutable ledger, decentralized control, and inherent security mechanisms are leveraged to improve transparency, accountability, and auditability in healthcare data exchanges. Furthermore, we introduce the concept of tokenizing patient data through NFTs, creating unique digital identifiers for each record, which allows for granular data access controls and proof of data ownership. These NFTs can also be employed to grant access to authorized parties, establishing a secure and transparent data sharing model that empowers both healthcare providers and patients. The proposed approach addresses common privacy concerns by employing privacy-preserving techniques such as zero-knowledge proofs (ZKPs) and homomorphic encryption to ensure that sensitive patient information can be shared without exposing the actual content of the data. This ensures compliance with regulations like HIPAA and GDPR. Additionally, the integration of Fast Healthcare Interoperability Resources (FHIR) with blockchain technology allows for enhanced interoperability, enabling healthcare organizations to exchange data seamlessly and securely across various systems while maintaining data governance and regulatory compliance. 
Through real-world case studies and simulations, this paper demonstrates how blockchain-based EHR sharing can reduce operational costs, improve patient outcomes, and enhance the security and privacy of healthcare data. This decentralized framework holds great potential for revolutionizing healthcare information exchange, providing a transparent, scalable, and secure method for managing patient data in a highly regulated environment.
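The tamper-evidence that an immutable ledger offers can be sketched with a simple hash chain. This is a minimal illustration under stated assumptions, not the framework proposed in the paper: the record fields (`record_id`, `action`, `to`) and function names are hypothetical, and no consensus, tokens, NFTs, or zero-knowledge proofs are modelled.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first block's predecessor

def block_hash(payload, prev_hash):
    """SHA-256 over a canonical JSON encoding of the payload and the previous hash."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_block(chain, payload):
    """Append an access-log entry that is linked to the previous entry by its hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev_hash,
                  "hash": block_hash(payload, prev_hash)})

def is_valid(chain):
    """Recompute every hash and link; any edited entry breaks the chain."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(block["payload"], block["prev"]):
            return False
        prev = block["hash"]
    return True

ledger = []
append_block(ledger, {"record_id": "ehr-001", "action": "access granted", "to": "clinic-A"})
append_block(ledger, {"record_id": "ehr-001", "action": "access revoked", "to": "clinic-A"})

valid_before = is_valid(ledger)           # chain is intact
ledger[0]["payload"]["to"] = "clinic-X"   # simulate tampering with an old entry
valid_after = is_valid(ledger)            # stored hash no longer matches
```

Only metadata about access events is chained here; in a design like the one described, the sensitive record content itself would stay off-chain or encrypted.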

Keywords: blockchain, electronic health records (ehrs), fast healthcare interoperability resources (fhir), health information exchange (hie), hl7, interoperability, non-fungible tokens (nfts), privacy-preserving techniques, tokens, secure data sharing

Procedia PDF Downloads 23
24429 The Twin Terminal of Pedestrian Trajectory Based on City Intelligent Model (CIM) 4.0

Authors: Chen Xi, Liu Xuebing, Lao Xueru, Kuan Sinman, Jiang Yike, Wang Hanwei, Yang Xiaolang, Zhou Junjie, Xie Jinpeng

Abstract:

To further promote the development of smart cities, the microscopic "nerve endings" of the City Intelligent Model (CIM) are extended to make it more sensitive. In this paper, we develop a pedestrian trajectory twin terminal based on CIM and convolutional neural network (CNN) technology. It uses 5G networks, architectural and geoinformatics technologies, and convolutional neural networks combined with deep learning models for human behavior recognition to provide empirical data, such as pedestrian flow data and human behavioral characteristics data. These data ultimately feed spatial performance evaluation criteria and spatial performance warning systems, making the empirical data accurate and intelligent for prediction and decision making.

Keywords: urban planning, urban governance, CIM, artificial intelligence, sustainable development

Procedia PDF Downloads 423
24428 Comparative Study of Antimicrobial, Antioxidant and Physicochemical Properties of Four Culinary Herbs Grown in Sri Lanka

Authors: Thilini Kananke

Abstract:

Culinary herbs have long been considered significant dietary sources of many potential health-promoting compounds. The present research focused on the analysis of antimicrobial, antioxidant and physicochemical properties of four selected culinary herbs, namely Murraya koenigii (Curry leaves), Pandanus amaryllifolius (Pandan leaves), Cymbopogon citratus (Lemon grass leaves), and Mentha piperita (Minchi leaves), obtained from several market sites in Ratnapura District, Sri Lanka. The antimicrobial activity of ethanolic, chloroform and distilled water extracts of the culinary herbs was evaluated against strains of Staphylococcus aureus, Salmonella typhi and Shigella spp. Total phenolic content and the radical scavenging activity (using the DPPH assay) of the culinary herbs were determined. Four heavy metals (Cu, Cd, Pb and Fe) were analyzed in the selected culinary herbs using atomic absorption spectroscopy (AAS). Proximate compositions of the selected herbs were analyzed using AOAC official methods. All selected culinary herbs showed relatively high inhibition zones against S. aureus. Pandan leaves showed the least antimicrobial activity against the selected bacterial strains compared with the other culinary herbs. Both the highest radical scavenging activity (lowest IC50 value) and the highest total phenolic content (25.57 ± 3.54 µg GAE/100 g) were reported in Mentha piperita extract. The highest concentrations of Cu, Fe and Cd were reported in Curry leaves (29.15 mg/kg), Lemon grass leaves (257.98 mg/kg) and Pandan leaves (6.05 mg/kg), respectively. The heavy metal contents detected in all culinary herbs were below the permitted limits set by WHO/FAO, except for Cd. The highest moisture (85.00 ± 0.00%) and fiber (10.66 ± 2.00%) contents were found in Pandan leaves, while the highest protein (8.94 ± 0.29%), fat (12.3 ± 2.52%) and ash (3.50 ± 0.17%) contents were reported in Curry leaves.
The information obtained from this study highlights the importance of further investigation of other antioxidant, antimicrobial and health promoting compounds of culinary herbs available in Sri Lanka for a detailed comparison.

Keywords: antimicrobial, antioxidant, culinary herbs, proximate analysis

Procedia PDF Downloads 181
24427 An Extended Inverse Pareto Distribution, with Applications

Authors: Abdel Hadi Ebraheim

Abstract:

This paper introduces a new extension of the Inverse Pareto distribution in the framework of the Marshall-Olkin (1997) family of distributions. This model is capable of modeling various shapes of aging and failure data. The statistical properties of the new model are discussed. Several methods are used to estimate the parameters involved. Explicit expressions are derived for different types of moments of value in reliability analysis. Besides, the order statistics of samples from the newly proposed model are studied. Finally, the usefulness of the new model for modeling reliability data is illustrated using two real data sets and a simulation study.
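For orientation, the general Marshall-Olkin construction can be written out for an inverse Pareto baseline. The parameterization below (shape τ, scale θ, tilt parameter α) is the common textbook form and is an assumption here; the paper may use different symbols or an equivalent reparameterization.

```latex
% Marshall-Olkin family: survival function built from a baseline survival \bar F
\bar G(x;\alpha) = \frac{\alpha\,\bar F(x)}{1-(1-\alpha)\,\bar F(x)}, \qquad \alpha > 0 .

% Assumed inverse Pareto baseline with shape \tau > 0 and scale \theta > 0
F(x) = \left(\frac{x}{x+\theta}\right)^{\tau}, \qquad x > 0, \qquad \bar F(x) = 1 - F(x).

% Resulting distribution function, using 1-(1-\alpha)\bar F = \alpha + (1-\alpha)F
G(x;\alpha,\tau,\theta) = \frac{F(x)}{\alpha + (1-\alpha)F(x)}
  = \frac{x^{\tau}}{\alpha\,(x+\theta)^{\tau} + (1-\alpha)\,x^{\tau}} .
```

Setting α = 1 recovers the baseline inverse Pareto distribution, so the extra parameter only adds flexibility.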

Keywords: Pareto distribution, Marshall-Olkin, reliability, hazard functions, moments, estimation

Procedia PDF Downloads 83
24426 Potential Determinants of Research Output: Comparing Economics and Business

Authors: Osiris Jorge Parcero, Néstor Gandelman, Flavia Roldán, Josef Montag

Abstract:

This paper uses cross-country unbalanced panel data of up to 146 countries over the period 1996 to 2015 to be the first study to identify potential determinants of a country’s relative research output in Economics versus Business. More generally, it is also one of the first studies comparing Economics and Business. The results show that better policy-related data availability, higher income inequality, and lower ethnic fractionalization relatively favor economics. The findings are robust to two alternative fixed effects specifications, three alternative definitions of economics and business, two alternative measures of research output (publications and citations), and the inclusion of meaningful control variables. To the best of our knowledge, our paper is also the first to demonstrate the importance of policy-related data as drivers of economic research. Our regressions show that the availability of this type of data is the single most important factor associated with the prevalence of economics over business as a research domain. Thus, our work has policy implications, as the availability of policy-related data is partially under policy control. Moreover, it has implications for students, professionals, universities, university departments, and research-funding agencies that face choices between profiles oriented toward economics and those oriented toward business. Finally, the conclusions show potential lines for further research.

Keywords: research output, publication performance, bibliometrics, economics, business, policy-related data

Procedia PDF Downloads 135
24425 Assessment of Routine Health Information System (RHIS) Quality Assurance Practices in Tarkwa Sub-Municipal Health Directorate, Ghana

Authors: Richard Okyere Boadu, Judith Obiri-Yeboah, Kwame Adu Okyere Boadu, Nathan Kumasenu Mensah, Grace Amoh-Agyei

Abstract:

Routine health information system (RHIS) quality assurance has become an important issue, not only because of its significance in promoting a high standard of patient care but also because of its impact on government budgets for the maintenance of health services. A routine health information system comprises healthcare data collection, compilation, storage, analysis, report generation, and dissemination on a routine basis in various healthcare settings. The data from RHIS give a representation of health status, health services, and health resources. The sources of RHIS data are normally individual health records, records of services delivered, and records of health resources. Using reliable information from routine health information systems is fundamental in the healthcare delivery system. Quality assurance practices are measures that are put in place to ensure the health data that are collected meet required quality standards. Routine health information system quality assurance practices ensure that data that are generated from the system are fit for use. This study considered quality assurance practices in the RHIS processes. Methods: A cross-sectional study was conducted in eight health facilities in Tarkwa Sub-Municipal Health Service in the western region of Ghana. The study involved routine quality assurance practices among the 90 health staff and management selected from facilities in Tarkwa Sub-Municipal who collected or used data routinely from 24th December 2019 to 20th January 2020. Results: Generally, Tarkwa Sub-Municipal health service appears to practice quality assurance during data collection, compilation, storage, analysis and dissemination. The results show some achievement in quality control performance in report dissemination (77.6%), data analysis (68.0%), data compilation (67.4%), report compilation (66.3%), data storage (66.3%) and collection (61.1%). 
Conclusions: Even though the Tarkwa Sub-Municipal Health Directorate engages in some control measures to ensure data quality, there is a need to strengthen the process to achieve the targeted percentage of performance (90.0%). There was a significant shortfall in quality assurance practices performance, especially during data collection, with respect to the expected performance.
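The gap between the reported scores and the 90.0% target can be made explicit with a few lines of arithmetic. The sketch below only restates the percentages given in the abstract; the variable names are illustrative.

```python
TARGET = 90.0  # targeted performance in percent, as stated in the abstract

# Reported quality control performance per RHIS step, in percent.
performance = {
    "report dissemination": 77.6,
    "data analysis": 68.0,
    "data compilation": 67.4,
    "report compilation": 66.3,
    "data storage": 66.3,
    "data collection": 61.1,
}

# Shortfall in percentage points relative to the target, rounded to one decimal.
shortfall = {step: round(TARGET - score, 1) for step, score in performance.items()}

# Step with the largest gap to the target.
largest_gap = max(shortfall, key=shortfall.get)
```

Under these figures the gaps range from 12.4 percentage points (report dissemination) to 28.9 percentage points (data collection), which matches the conclusion that collection is the weakest step.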

Keywords: quality assurance practices, assessment of routine health information system quality, routine health information system, data quality

Procedia PDF Downloads 82
24424 Heart Failure Identification and Progression by Classifying Cardiac Patients

Authors: Muhammad Saqlain, Nazar Abbas Saqib, Muazzam A. Khan

Abstract:

Heart failure (HF) has become a major health problem in our society. The prevalence of HF increases as patients age, and it is a major cause of the high mortality rate in adults. Successful identification of HF and of its progression can help reduce the individual and social burden of this syndrome. In this study, we use a real data set of cardiac patients to propose a classification model for the identification and progression of HF. The data set is divided into three age groups, namely young, adult, and old, and each age group is further classified into four classes according to the patient’s current physical condition. Contemporary data mining classification algorithms are applied to each individual class of every age group to identify HF. The Decision Tree (DT) gives the highest accuracy of 90% and outperforms all other algorithms. Our model accurately diagnoses different stages of HF for each age group and can be very useful for the early prediction of HF.

Keywords: decision tree, heart failure, data mining, classification model

Procedia PDF Downloads 402
24423 Critically Analyzing the Application of Big Data for Smart Transportation: A Case Study of Mumbai

Authors: Tanuj Joshi

Abstract:

Smart transportation is fast emerging as a solution to modern cities’ mobility issues, delayed emergency response rates and high congestion on streets. The present-day scenario with Google Maps, Waze, Yelp etc. demonstrates how information and communications technologies control the intelligent transportation system. This intangible and invisible infrastructure is largely guided by big data analytics. On the other side, the exponential increase in India’s urban population has intensified the demand for better services and infrastructure to satisfy the transportation needs of its citizens. No doubt, India’s huge internet usage is seen as an important resource to help achieve this. However, with a projected number of over 40 billion objects connected to the Internet by 2025, the need for systems to handle massive volumes of data (big data) also arises. This research paper attempts to identify ways of exploiting the big data variables which will aid commuters on Indian tracks. This study explores real-life inputs by conducting a survey and interviews to identify which gaps need to be targeted to better satisfy the customers. Several experts at Mumbai Metropolitan Region Development Authority (MMRDA), Mumbai Metro and Brihanmumbai Electric Supply and Transport (BEST) were interviewed regarding the Information Technology (IT) systems currently in use. The interviews give relevant insights into the workings and requirements of public transportation systems, whereas the survey investigates the macro situation.

Keywords: smart transportation, mobility issue, Mumbai transportation, big data, data analysis

Procedia PDF Downloads 179
24422 Scientific Linux Cluster for BIG-DATA Analysis (SLBD): A Case of Fayoum University

Authors: Hassan S. Hussein, Rania A. Abul Seoud, Amr M. Refaat

Abstract:

Scientific researchers face the analysis of very large data sets, which are growing at a noticeable rate with today’s and tomorrow’s technologies. Hadoop and Spark are software frameworks developed for this purpose. The Hadoop framework is suitable for many different hardware platforms. In this research, a scientific Linux cluster for Big Data analysis (SLBD) is presented. SLBD runs open source software with large computational capacity and a high-performance cluster infrastructure. SLBD is composed of one cluster of identical, commodity-grade computers interconnected via a small LAN. It consists of a fast switch and Gigabit Ethernet cards connecting four nodes. Cloudera Manager is used to configure and manage an Apache Hadoop stack. Hadoop is a framework that allows storing and processing big data across the cluster by using the MapReduce algorithm. The MapReduce algorithm divides the task into smaller tasks, which are assigned to the network nodes. The algorithm then collects the results and forms the final result dataset. The SLBD clustering system allows fast and efficient processing of large amounts of data resulting from different applications. SLBD also provides high performance, high throughput, high availability, expandability and cluster scalability.
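The split, process, and collect pattern described above can be sketched in a few lines. This is a single-process illustration of the MapReduce idea using a word count, not Hadoop code; the function names and the sample text chunks are made up, and on a real cluster each chunk would be handled by a different node.

```python
from collections import defaultdict
from itertools import chain

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of the input."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(chunks):
    # Each chunk stands in for the sub-task assigned to one node.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle(mapped))

counts = word_count(["big data on the cluster", "the cluster stores big data"])
```

Because the map step treats every chunk independently, adding nodes lets more chunks be processed at once, which is where the scalability mentioned in the abstract comes from.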

Keywords: big data platforms, cloudera manager, Hadoop, MapReduce

Procedia PDF Downloads 361
24421 Investigating the Effects of Data Transformations on a Bi-Dimensional Chi-Square Test

Authors: Alexandru George Vaduva, Adriana Vlad, Bogdan Badea

Abstract:

In this research, we conduct a Monte Carlo analysis on a two-dimensional χ2 test, which is used to determine the minimum distance required for independent sampling in the context of chaotic signals. We investigate the impact of transforming initial data sets from any probability distribution to new signals with a uniform distribution using the Spearman rank correlation on the χ2 test. This transformation removes the randomness of the data pairs, and as a result, the observed distribution of χ2 test values differs from the expected distribution. We propose a solution to this problem and evaluate it using another chaotic signal.
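A rough sketch of the ingredients may help: a chaotic signal, a rank-based transformation to a uniform distribution, and a two-dimensional χ2 statistic on pairs sampled a given distance apart. Everything below is an assumption made for illustration (the logistic map with parameter 4, the `(rank + 0.5) / n` transform, a 5 × 5 grid, 2000 pairs); it is not the authors' test procedure or their proposed correction.

```python
def logistic_map(x0, n, burn_in=100):
    """Generate n samples of x -> 4x(1-x) after discarding a burn-in transient."""
    x = x0
    for _ in range(burn_in):
        x = 4.0 * x * (1.0 - x)
    samples = []
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        samples.append(x)
    return samples

def rank_to_uniform(values):
    """Replace each value by (rank + 0.5) / n, giving a uniform marginal on (0, 1)."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    uniform = [0.0] * n
    for rank, index in enumerate(order):
        uniform[index] = (rank + 0.5) / n
    return uniform

def chi_square_2d(u, v, bins):
    """Chi-square statistic of the pairs (u, v) against a uniform bins x bins grid."""
    counts = [[0] * bins for _ in range(bins)]
    for a, b in zip(u, v):
        counts[min(int(a * bins), bins - 1)][min(int(b * bins), bins - 1)] += 1
    expected = len(u) / (bins * bins)
    return sum((c - expected) ** 2 / expected for row in counts for c in row)

signal = logistic_map(0.3, 2100)

def statistic(distance, n_pairs=2000, bins=5):
    """Statistic for pairs (x_n, x_{n+distance}) after the rank transformation."""
    x = signal[:n_pairs]
    y = signal[distance:distance + n_pairs]
    return chi_square_2d(rank_to_uniform(x), rank_to_uniform(y), bins)

chi2_adjacent = statistic(1)   # neighbouring samples are strongly dependent
chi2_far = statistic(50)       # widely separated samples behave almost independently
```

Note that the rank transformation fixes the marginal bin counts exactly, which is one plausible reason the statistic no longer follows the distribution expected for freely sampled pairs.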

Keywords: chaotic signals, logistic map, Pearson’s test, Chi Square test, bivariate distribution, statistical independence

Procedia PDF Downloads 99