Search results for: LiDAR datasets
246 Hysteresis Modeling in Iron-Dominated Magnets Based on a Deep Neural Network Approach
Authors: Maria Amodeo, Pasquale Arpaia, Marco Buzio, Vincenzo Di Capua, Francesco Donnarumma
Abstract:
Different deep neural network architectures have been compared and tested to predict magnetic hysteresis in the context of pulsed electromagnets for experimental physics applications. Modelling quasi-static or dynamic major and especially minor hysteresis loops is one of the most challenging topics in computational magnetism. Recent attempts at mathematical prediction in this context using Preisach models could not attain better than percent-level accuracy. Hence, this work explores neural network approaches and shows that the architecture that best fits the measured magnetic field behaviour, including the effects of hysteresis and eddy currents, is the nonlinear autoregressive exogenous neural network (NARX) model. This architecture achieves a relative RMSE on the order of a few hundred ppm for complex magnetic field cycling, including arbitrary sequences of pseudo-random high-field and low-field cycles. The NARX-based architecture is compared with the state of the art, showing better performance than the classical operator-based and differential models, and is tested on a reference quadrupole magnetic lens used for CERN particle beams, chosen as a case study. The training and test datasets are a representative example of real-world magnet operation; this makes the good results obtained very promising for future applications in this context.
Keywords: deep neural network, magnetic modelling, measurement and empirical software engineering, NARX
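As a rough illustration of the NARX structure this abstract refers to (a sketch only, not the authors' trained network: the linear form, function names, and weights below are hypothetical stand-ins for a learned nonlinear map), the next field value is regressed on lagged outputs and lagged exogenous inputs such as the excitation current:

```python
def narx_predict(past_B, past_I, w_B, w_I, bias=0.0):
    """One-step NARX-style prediction: B(t) as a weighted sum of lagged
    outputs B(t-1..t-p) and lagged exogenous inputs I(t-1..t-q)."""
    assert len(past_B) == len(w_B) and len(past_I) == len(w_I)
    s = bias
    s += sum(w * b for w, b in zip(w_B, past_B))
    s += sum(w * i for w, i in zip(w_I, past_I))
    return s

def simulate(I_series, w_B, w_I, B0):
    """Closed-loop simulation: predictions are fed back as lagged outputs,
    which is how a NARX model tracks an arbitrary excitation cycle."""
    p, q = len(w_B), len(w_I)
    B = list(B0)  # initial output lags, length p
    for t in range(q, len(I_series)):
        B.append(narx_predict(B[-p:], I_series[t - q:t], w_B, w_I))
    return B
```

In a real application the weighted sum would be replaced by a trained network, and the lag depths p and q tuned to capture eddy-current time constants.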
Procedia PDF Downloads 130
245 The Impact of Deprivation on the Prevalence of Common Mental Health Disorders in Clinical Commissioning Groups across England: A Retrospective, Cross-Sectional Study
Authors: Mohammed-Hareef Asunramu, Sana Hashemi, Raja Ohri, Luc Worthington, Nadia Zaman, Junkai Zhu
Abstract:
Background: The 2012 Health and Social Care Act committed to a ‘parity of esteem’ between mental and physical health services. Although this investment aimed both to increase the quality of services and to ensure the retention of mental health staff, questions remained regarding its ability to prevent mental health problems. One possible solution is a focus on the social determinants of health, which have been shown to impact mental health. Aim: To examine the relationship between the Index of Multiple Deprivation (IMD) and the prevalence of common mental health disorders (CMD) for CCGs in NHS England between 2019 and 2020. Design and setting: Cross-sectional analysis of 189 CCGs in NHS England. Methods: A multivariate linear regression model was utilized with CMD as the outcome variable and IMD, age, and ethnicity as explanatory variables. Datasets were obtained from Public Health England and the latest UK Census. Results: CCG IMD was found to have a significantly positive relationship with CMD: for every 1-point increase in IMD, CMD increases by 0.25%. Ethnicity had a significantly positive relationship with CMD: for every 1% increase in the population that identifies as BME, there is a 0.03% increase in CMD. Age had a significantly negative relationship with CMD: for every 1% increase in the population aged 60+, there is a 0.11% decrease in CMD. Conclusion: This study demonstrates that addressing mental health issues may require a multi-pronged approach. Beyond budget increases, it is essential to prioritize health equity, with careful consideration of ethnic minorities and different age brackets.
Keywords: deprivation, health inequality, mental health, social determinants
Procedia PDF Downloads 127
244 Multi-Stream Graph Attention Network for Recommendation with Knowledge Graph
Abstract:
In recent years, graph neural networks have been widely used in knowledge graph recommendation. Existing recommendation methods based on graph neural networks extract information from the knowledge graph through entities and relations, which may not be an efficient way of extracting information. To better surface useful entity information in the knowledge graph for the current recommendation task, we propose an end-to-end neural network model based on a multi-stream graph attention mechanism (MSGAT), which can effectively integrate the knowledge graph into the recommendation system by evaluating the importance of entities from both the user and item perspectives. Specifically, we use the attention mechanism from the user's perspective to distil the domain node information of the predicted item in the knowledge graph, enhance the user's information on items, and generate the feature representation of the predicted item. Since a user's historical click items reflect the user's interest distribution, we propose a multi-stream attention mechanism that, based on the user's preference for entities and relations and the similarity between the item to be predicted and entities, aggregates the neighborhood entity information of the user's historical click items in the knowledge graph and generates the user's feature representation. We evaluate our model on three real recommendation datasets: Movielens-1M (ML-1M), LFM-1B 2015 (LFM-1B), and Amazon-Book (AZ-book). Experimental results show that, compared with the most advanced models, our proposed model better captures the entity information in the knowledge graph, which proves the validity and accuracy of the model.
Keywords: graph attention network, knowledge graph, recommendation, information propagation
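A minimal sketch of the attention-weighted neighbor aggregation idea underlying graph attention models like the one described above (illustrative only: MSGAT's actual scoring uses learned projections over entities and relations, which are omitted here in favor of a plain dot product):

```python
import math

def attention_aggregate(node_vec, neighbor_vecs):
    """Softmax attention over neighbor entities: score each neighbor by its
    dot product with the query node, normalize with softmax, then take the
    weighted average of neighbor vectors."""
    scores = [sum(a * b for a, b in zip(node_vec, n)) for n in neighbor_vecs]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(node_vec)
    return [sum(w * n[i] for w, n in zip(weights, neighbor_vecs))
            for i in range(dim)]
```

Neighbors more similar to the query node receive higher weight, so their information dominates the aggregated representation.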
Procedia PDF Downloads 116
243 INRAM-3DCNN: Multi-Scale Convolutional Neural Network Based on Residual and Attention Module Combined with Multilayer Perceptron for Hyperspectral Image Classification
Authors: Jianhong Xiang, Rui Sun, Linyu Wang
Abstract:
In recent years, owing to the continuous improvement of deep learning theory, the Convolutional Neural Network (CNN) has shown superior performance in research on Hyperspectral Image (HSI) classification. Since HSI has rich spatial-spectral information, utilizing only a single-dimensional or single-size convolutional kernel limits the detailed feature information received by the CNN, which in turn limits the classification accuracy of HSI. In this paper, we design a multi-scale CNN with an MLP based on residual and attention modules (INRAM-3DCNN) for the HSI classification task. We propose to use multiple 3D convolutional kernels to extract the packet feature information and fully learn the spatial-spectral features of HSI, while designing residual 3D convolutional branches to avoid the decline in classification accuracy due to network degradation. Secondly, we design a 2D Inception module with a joint channel attention mechanism to quickly extract key spatial feature information at different scales of HSI and reduce the complexity of the 3D model. Owing to the high parallel processing capability and nonlinear global action of the Multilayer Perceptron (MLP), we use it in combination with the preceding CNN structure for the final classification step. The experimental results on two HSI datasets show that the proposed INRAM-3DCNN method has superior classification performance and can perform the classification task excellently.
Keywords: INRAM-3DCNN, residual, channel attention, hyperspectral image classification
Procedia PDF Downloads 79
242 Classifier for Liver Ultrasound Images
Authors: Soumya Sajjan
Abstract:
Liver cancer is the most common cancer worldwide in men and women and is one of the few cancers still on the rise. Liver disease is the 4th leading cause of death. According to new NHS (National Health Service) figures, deaths from liver diseases have reached record levels, rising by 25% in less than a decade; heavy drinking, obesity, and hepatitis are believed to be behind the rise. In this study, we focus on the development of a diagnostic classifier for ultrasound liver lesions. Ultrasound (US) sonography is an easy-to-use and widely popular imaging modality because of its ability to visualize many human soft tissues/organs without any harmful effect. This paper provides an overview of the underlying concepts, along with algorithms for processing liver ultrasound images. Naturally, ultrasound liver lesion images contain considerable speckle noise, so developing a classifier for them is a challenging task. We propose a fully automatic machine learning system for developing this classifier. First, we segment the liver image by calculating textural features from the co-occurrence matrix and the run length method. For classification, a Support Vector Machine is used, based on the risk bounds of statistical learning theory. The textural features from the different feature methods are given as input to the SVM individually. Performance analysis on the training and test datasets was carried out separately using the SVM model. Whenever an ultrasonic liver lesion image is given to the SVM classifier system, its features are calculated and it is classified as a normal or diseased liver lesion. We hope the result will help physicians identify liver cancer non-invasively.
Keywords: segmentation, Support Vector Machine, ultrasound liver lesion, co-occurrence matrix
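To make the co-occurrence-matrix texture features concrete, here is a minimal sketch (illustrative only: the paper's pipeline also uses run-length features and an SVM, and real images would be quantized to more gray levels than the toy example below):

```python
def glcm(img, dx=1, dy=0, levels=4):
    """Gray-level co-occurrence matrix for one pixel offset (dx, dy):
    counts how often gray level i occurs next to gray level j, then
    normalizes the counts to probabilities."""
    h, w = len(img), len(img[0])
    M = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                M[img[y][x]][img[y2][x2]] += 1
    total = sum(sum(row) for row in M) or 1
    return [[c / total for c in row] for row in M]

def contrast(P):
    """Texture contrast: large when neighboring pixels differ strongly."""
    n = len(P)
    return sum(P[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

def energy(P):
    """Texture energy (angular second moment): large for uniform textures."""
    return sum(p * p for row in P for p in row)
```

Scalar features like these, computed per region, form the input vector handed to the SVM classifier.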
Procedia PDF Downloads 411
241 Image Ranking to Assist Object Labeling for Training Detection Models
Authors: Tonislav Ivanov, Oleksii Nedashkivskyi, Denis Babeshko, Vadim Pinskiy, Matthew Putman
Abstract:
Training a machine learning model for object detection that generalizes well is known to benefit from a training dataset with diverse examples. However, training datasets usually contain many repeats of common examples of a class and lack rarely seen examples. This is due to the process commonly used during human annotation where a person would proceed sequentially through a list of images labeling a sufficiently high total number of examples. Instead, the method presented involves an active process where, after the initial labeling of several images is completed, the next subset of images for labeling is selected by an algorithm. This process of algorithmic image selection and manual labeling continues in an iterative fashion. The algorithm used for the image selection is a deep learning algorithm, based on the U-shaped architecture, which quantifies the presence of unseen data in each image in order to find images that contain the most novel examples. Moreover, the location of the unseen data in each image is highlighted, aiding the labeler in spotting these examples. Experiments performed using semiconductor wafer data show that labeling a subset of the data, curated by this algorithm, resulted in a model with a better performance than a model produced from sequentially labeling the same amount of data. Also, similar performance is achieved compared to a model trained on exhaustive labeling of the whole dataset. Overall, the proposed approach results in a dataset that has a diverse set of examples per class as well as more balanced classes, which proves beneficial when training a deep learning model.
Keywords: computer vision, deep learning, object detection, semiconductor
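The selection step of the iterative loop described above can be sketched as follows (a toy stand-in: in the paper the novelty score per image comes from a U-shaped deep network, whereas here it is simply assumed to be given):

```python
def select_next_batch(novelty_scores, k):
    """Rank unlabeled images by novelty score (higher = more unseen content)
    and return the k most novel for the next round of manual labeling."""
    ranked = sorted(novelty_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]
```

After each labeling round the model is updated, scores are recomputed on the remaining pool, and the loop repeats.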
Procedia PDF Downloads 136
240 Simulation of Climatic Change Effects on the Potential Fishing Zones of Dorado Fish (Coryphaena hippurus L.) in the Colombian Pacific under Scenarios RCP Using CMIP5 Model
Authors: Adriana Martínez-Arias, John Josephraj Selvaraj, Luis Octavio González-Salcedo
Abstract:
In the Colombian Pacific, the Dorado fish (Coryphaena hippurus L.) fishery is of great commercial interest. However, its habitat and fishery may be affected by climate change, especially by the current increase in sea surface temperature. Hence, it is of interest to study the dynamics of this species' fishing zones. In this study, we developed Artificial Neural Network (ANN) models to predict Catch per Unit Effort (CPUE) as an indicator of species abundance. The model was based on four oceanographic variables (chlorophyll a, sea surface temperature, sea level anomaly, and bathymetry) derived from satellite data. CPUE datasets for model training and cross-validation were obtained from the logbooks of commercial fishing vessels. Sea surface temperature for the Colombian Pacific was projected under Representative Concentration Pathway (RCP) scenarios 4.5 and 8.5 using the Coupled Model Intercomparison Project Phase 5 (CMIP5), and CPUE maps were created. Our results indicate that an increase in sea surface temperature reduces the potential fishing zones of this species in the Colombian Pacific. We conclude that ANNs are a reliable tool for simulating climate change effects on potential fishing zones. This research opens a future agenda for other species that have been affected by climate change.
Keywords: climatic change, artificial neural networks, dorado fish, CPUE
Procedia PDF Downloads 243
239 A Fast Community Detection Algorithm
Authors: Chung-Yuan Huang, Yu-Hsiang Fu, Chuen-Tsai Sun
Abstract:
Community detection represents an important data-mining tool for analyzing and understanding real-world complex network structures and functions. We believe that at least four criteria determine the appropriateness of a community detection algorithm: (a) it produces useable normalized mutual information (NMI) and modularity results for social networks, (b) it overcomes resolution limitation problems associated with synthetic networks, (c) it produces good NMI results and performance efficiency for Lancichinetti-Fortunato-Radicchi (LFR) benchmark networks, and (d) it produces good modularity and performance efficiency for large-scale real-world complex networks. To our knowledge, no existing community detection algorithm meets all four criteria. In this paper, we describe a simple hierarchical arc-merging (HAM) algorithm that uses network topologies and rule-based arc-merging strategies to identify community structures that satisfy the criteria. We used five well-studied social network datasets and eight sets of LFR benchmark networks to validate the ground-truth community correctness of HAM, eight large-scale real-world complex networks to measure its performance efficiency, and two synthetic networks to determine its susceptibility to resolution limitation problems. Our results indicate that the proposed HAM algorithm is capable of providing satisfactory performance efficiency and that HAM-identified communities were close to ground-truth communities in social and LFR benchmark networks while overcoming resolution limitation problems.
Keywords: complex network, social network, community detection, network hierarchy
Procedia PDF Downloads 227
238 Q-Map: Clinical Concept Mining from Clinical Documents
Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala
Abstract:
Over the past decade, there has been a steep rise in data-driven analysis in major areas of medicine, such as clinical decision support systems, survival analysis, patient similarity analysis, and image analytics. Most of the data in the field are well-structured and available in numerical or categorical formats, which can be used for experiments directly. But at the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature; it is found in the form of discharge summaries, clinical notes, and procedural notes, which are in human-written narrative format and have neither a relational model nor any standard grammatical structure. An important step in utilizing these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map, a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique based on a string matching algorithm indexed on curated knowledge sources, which is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concept retrieval, and present the advantages the former displays over the latter.
Keywords: information retrieval, unified medical language system, syntax based analysis, natural language processing, medical informatics
Procedia PDF Downloads 133
237 Infrastructure Change Monitoring Using Multitemporal Multispectral Satellite Images
Authors: U. Datta
Abstract:
The main objective of this study is to find a suitable approach to monitoring land infrastructure growth over a period of time using multispectral satellite images. Bi-temporal change detection methods are unable to indicate continuous change occurring over a long period of time. To achieve this objective, the approach used here estimates a statistical model from a series of multispectral image data over a long period of time, assuming there is no considerable change during that period, and then compares it with the multispectral image data obtained at a later time. The change is estimated pixel-wise. A statistical composite hypothesis technique is used for pixel-based change detection in a defined region. The generalized likelihood ratio test (GLRT) is used to detect changed pixels against the probabilistic model estimated for the corresponding pixel. A changed pixel is detected assuming that the images have been co-registered prior to estimation. To minimize error due to co-registration, the 8-neighborhood pixels around the pixel under test are also considered. Multispectral images from Sentinel-2 and Landsat-8 from 2015 to 2018 are used for this purpose. There are different challenges in this method. The first and foremost challenge is obtaining a large enough number of datasets for multivariate distribution modelling, since many images are always discarded due to cloud coverage. Owing to imperfect modelling, there will be a high probability of false alarms. The overall conclusion that can be drawn from this work is that the probabilistic method described in this paper has given some promising results, which need to be pursued further.
Keywords: co-registration, GLRT, infrastructure growth, multispectral, multitemporal, pixel-based change detection
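The per-pixel test described above can be sketched for the simplest case (illustrative only: a scalar Gaussian model per pixel, whereas the study fits a multivariate distribution over spectral bands; under a known-variance Gaussian, the log-likelihood ratio for a mean shift reduces to the squared standardized residual):

```python
def fit_pixel_model(history):
    """Estimate per-pixel Gaussian parameters (mean, variance) from a
    series of no-change observations of that pixel."""
    mu = sum(history) / len(history)
    var = sum((v - mu) ** 2 for v in history) / len(history)
    return mu, var

def glrt_changed(x, mu, var, threshold=9.0):
    """GLRT for a mean shift under a Gaussian model with known variance:
    flag the pixel as changed when the squared standardized residual
    exceeds the threshold. Returns (changed?, test statistic)."""
    stat = (x - mu) ** 2 / var
    return stat > threshold, stat
```

Applying this independently at every co-registered pixel of the later image yields a binary change map.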
Procedia PDF Downloads 134
236 Membrane-Localized Mutations as Predictors of Checkpoint Blockade Efficacy in Cancer
Authors: Zoe Goldberger, Priscilla S. Briquez, Jeffrey A. Hubbell
Abstract:
Tumor cells carry mutations, resulting from genetic instability, that the immune system can actively recognize. Immune checkpoint immunotherapy (ICI) is commonly used in the clinic to re-activate immune reactions against mutated proteins, called neoantigens, resulting in tumor remission in cancer patients. However, only around 20% of patients show a durable response to ICI. While tumor mutational burden (TMB) has been approved by the Food and Drug Administration (FDA) as a criterion for ICI therapy, the relevance of the subcellular localizations of the mutated proteins within the tumor cell has not been investigated. Here, we hypothesized that the localization of mutations impacts immune responsiveness to ICI. We analyzed publicly available tumor mutation sequencing data of ICI-treated patients from 3 independent datasets. We extracted the subcellular localization from the UniProtKB/Swiss-Prot database and quantified the proportion of membrane, cytoplasmic, nuclear, or secreted mutations per patient. We analyzed this information in relation to response to ICI treatment and overall survival, showing across 1722 ICI-treated patients that a high mutational burden localized at the membrane (mTMB) correlates with ICI responsiveness and improved overall survival in multiple cancer types. We anticipate that our results will improve the predictability of cancer patient response to ICI, with potential implications for clinical guidelines to tailor ICI treatment. This would not only increase survival for patients receiving ICI, but also improve patients' quality of life by reducing the number of patients enduring non-effective ICI treatments.
Keywords: cancer, immunotherapy, membrane neoantigens, efficacy prediction, biomarkers
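The per-patient quantification step can be sketched as follows (a sketch under assumptions: the localization lookup table stands in for UniProtKB/Swiss-Prot annotations, and gene symbols here are hypothetical):

```python
def membrane_tmb(mutated_genes, localization):
    """Fraction of a patient's mutations whose protein product is annotated
    as membrane-localized; the same tally works for cytoplasmic, nuclear,
    or secreted compartments."""
    if not mutated_genes:
        return 0.0
    hits = sum(1 for g in mutated_genes if localization.get(g) == "membrane")
    return hits / len(mutated_genes)
```

Patients would then be stratified by this membrane fraction (and total count) before relating it to ICI response and survival.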
Procedia PDF Downloads 109
235 The Influence of Air Temperature Controls in Estimation of Air Temperature over Homogeneous Terrain
Authors: Fariza Yunus, Jasmee Jaafar, Zamalia Mahmud, Nurul Nisa’ Khairul Azmi, Nursalleh K. Chang
Abstract:
Variation of air temperature from one place to another is caused by air temperature controls. In general, the most important control of air temperature is elevation. Another significant independent variable in estimating air temperature is the location of meteorological stations. Distance to the coastline and land use type also contribute to significant variations in air temperature. On the other hand, over homogeneous terrain, direct interpolation of discrete points of air temperature works well to estimate air temperature values in un-sampled areas; in this process, the estimation is based solely on the discrete points of air temperature. However, this study shows that air temperature controls also play significant roles in estimating air temperature over the homogeneous terrain of Peninsular Malaysia. An Inverse Distance Weighting (IDW) interpolation technique was adopted to generate continuous data of air temperature. This study compared two different datasets: observed mean monthly data of T, and the estimation error T–T’, where T’ is the value estimated from a multiple regression model. The multiple regression model considered eight independent variables (elevation, latitude, longitude, distance to coastline, and four land use types: water bodies, forest, agriculture, and built-up areas) to represent the role of air temperature controls. Cross-validation analysis was conducted to review the accuracy of the estimated values. The final results show that the estimated values of T–T’ produced lower errors for mean monthly air temperature over homogeneous terrain in Peninsular Malaysia.
Keywords: air temperature control, interpolation analysis, peninsular Malaysia, regression model, air temperature
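The IDW interpolation technique named above has a compact form; a minimal sketch (station coordinates and the power parameter are illustrative, and the study applies this to T and to the residual T–T’ alike):

```python
def idw(x, y, samples, power=2):
    """Inverse Distance Weighting: estimate the value at (x, y) as a
    weighted average of (xi, yi, ti) samples, with weights 1/d^power.
    Returns the sample value exactly when (x, y) coincides with a station."""
    num = den = 0.0
    for xi, yi, ti in samples:
        d2 = (x - xi) ** 2 + (y - yi) ** 2
        if d2 == 0.0:
            return ti  # exact hit on a station
        w = 1.0 / d2 ** (power / 2)
        num += w * ti
        den += w
    return num / den
```

Evaluating this on a regular grid of (x, y) points produces the continuous air temperature surface.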
Procedia PDF Downloads 374
234 An Enhanced Approach in Validating Analytical Methods Using Tolerance-Based Design of Experiments (DoE)
Authors: Gule Teri
Abstract:
The effective validation of analytical methods forms a crucial component of pharmaceutical manufacturing. However, traditional validation techniques can occasionally fail to fully account for inherent variations within datasets, which may result in inconsistent outcomes. This deficiency in validation accuracy is particularly noticeable when quantifying low concentrations of active pharmaceutical ingredients (APIs), excipients, or impurities, introducing a risk to the reliability of the results and, subsequently, the safety and effectiveness of the pharmaceutical products. In response to this challenge, we introduce an enhanced, tolerance-based Design of Experiments (DoE) approach for the validation of analytical methods. This approach distinctly measures variability with reference to tolerance or design margins, enhancing the precision and trustworthiness of the results. It provides a systematic, statistically grounded validation technique that improves the reliability of results and offers an essential tool for industry professionals aiming to guarantee the accuracy of their measurements, particularly for low-concentration components. By incorporating this innovative method, pharmaceutical manufacturers can substantially advance their validation processes, subsequently improving the overall quality and safety of their products. This paper delves deeper into the development, application, and advantages of this tolerance-based DoE approach and demonstrates its effectiveness using High-Performance Liquid Chromatography (HPLC) data for verification. This paper also discusses the potential implications and future applications of this method in enhancing pharmaceutical manufacturing practices and outcomes.
Keywords: tolerance-based design, design of experiments, analytical method validation, quality control, biopharmaceutical manufacturing
Procedia PDF Downloads 80
233 Learning Dynamic Representations of Nodes in Temporally Variant Graphs
Authors: Sandra Mitrovic, Gaurav Singh
Abstract:
In many industries, including telecommunications, churn prediction has been a topic of active research. A lot of attention has been devoted to devising the most informative features, and this area of research has gained even more focus with the spread of (social) network analytics. Call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information; instead, ad-hoc and dataset-dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in an observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic and time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them, and finally compressing them using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on the churn prediction task in the telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from the time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models like SVM and Logistic Regression. The observed results show the effectiveness of the proposed approach compared to ad-hoc feature-selection-based approaches and static node2vec.
Keywords: churn prediction, dynamic networks, node2vec, auto-encoders
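The concatenation step of the approach can be sketched as follows (a sketch under assumptions: per-timestamp embeddings are assumed to be given as dicts from node to vector, a node absent at some timestamp is zero-padded, and the auto-encoder compression stage described in the abstract is omitted):

```python
def temporal_embedding(embeddings_by_time, node, dim):
    """Concatenate a node's per-timestamp node2vec vectors into one long
    feature vector, zero-padding timestamps where the node is absent.
    The result would then be compressed (e.g. by an auto-encoder)."""
    vec = []
    for t_emb in embeddings_by_time:
        vec.extend(t_emb.get(node, [0.0] * dim))
    return vec
```

The concatenated vector has length dim × number_of_timestamps, which is why the paper compresses it before feeding it to SVM or Logistic Regression.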
Procedia PDF Downloads 314
232 hsa-miR-1204 and hsa-miR-639 Prominent Role in Tamoxifen's Molecular Mechanisms on the EMT Phenomenon in Breast Cancer Patients
Authors: Mahsa Taghavi
Abstract:
In the treatment of breast cancer, tamoxifen is a regularly prescribed medication. In this study, the effect of tamoxifen on the EMT pathways of breast cancer patients was examined to see whether it had any effect on the cancer cells' resistance to tamoxifen and to identify specific miRNAs associated with EMT. We used continuous and integrated bioinformatics analysis to choose the optimal GEO datasets. Once we had sorted the gene expression profiles, we examined the signaling mechanisms, the gene ontology, and the protein interactions of each gene. Finally, we used the GEPIA database to confirm the candidate genes, after which critical miRNAs related to the candidate genes were investigated. The gene expression profiles fell into two distinct groups: the first group comprised genes whose expression was lowered in the EMT pathway, and the second group represented the opposite. A total of 253 genes from the first group and 302 genes from the second group were found to be common. Several genes in the first category were linked to cell death, focal adhesion, and cellular aging, while genes in the second group were linked to distinct cell cycle stages. Finally, proteins such as MYLK, SOCS3, and STAT5B from the first group and BIRC5, PLK1, and RAPGAP1 from the second group were selected as potential candidates linked to tamoxifen's influence on the EMT pathway. hsa-miR-1204 and hsa-miR-639 have a very close relationship with the candidate genes according to node degree and the betweenness index. These results give a better understanding of how tamoxifen acts on the EMT pathway; further study of tamoxifen's target genes and proteins is needed to understand the drug more fully.
Keywords: tamoxifen, breast cancer, bioinformatics analysis, EMT, miRNAs
Procedia PDF Downloads 129
231 Impact of Social Transfers on Energy Poverty in Turkey
Authors: Julide Yildirim, Nadir Ocal
Abstract:
Even though there are many studies investigating the extent and determinants of poverty, there is a paucity of research investigating the issue of energy poverty in Turkey. The aim of this paper is threefold: first, to investigate the extent of energy poverty in Turkey by using Household Budget Survey datasets belonging to the 2005 - 2016 period; second, to examine the risk factors for energy poverty; and finally, to assess the impact of social assistance program participation on energy poverty. The existing literature employs alternative methods to measure energy poverty. In this study, energy poverty is measured by employing the expenditure approach, where people are considered energy poor if they disburse more than 10 per cent of their income to meet their energy requirements. Empirical results indicate that the energy poverty rate is around 20 per cent during the time period under consideration. Since Household Budget Survey panel data are not available for the 2005 - 2016 period, a pseudo panel has been constructed. The panel logistic regression method is utilized to determine the risk factors for energy poverty. The empirical results demonstrate that work status and education level have a statistically significant impact on the likelihood of energy poverty. In the final part of the paper, the impact of social transfers on energy poverty is examined by utilizing a panel biprobit model, where social transfer participation and energy poverty incidence are jointly modeled. The empirical findings indicate that social transfer program participation reduces energy poverty, and this negative association is more pronounced in urban areas than in rural areas.
Keywords: energy poverty, social transfers, panel data models, Turkey
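The expenditure approach used here has a direct computational form; a minimal sketch (the 10 per cent threshold is from the abstract, the sample household figures are invented for illustration):

```python
def is_energy_poor(income, energy_spend, threshold=0.10):
    """Expenditure approach: a household is energy poor if its energy
    spending exceeds 10 per cent of its income."""
    return energy_spend > threshold * income

def energy_poverty_rate(households):
    """Share of energy-poor households in a list of (income, energy_spend)."""
    poor = sum(is_energy_poor(inc, en) for inc, en in households)
    return poor / len(households)
```

Applying this classifier to each survey wave yields the roughly 20 per cent headcount rate reported in the abstract.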
Procedia PDF Downloads 141
230 Modeling of Sediment Yield and Streamflow of Watershed Basin in the Philippines Using the Soil Water Assessment Tool Model for Watershed Sustainability
Authors: Warda L. Panondi, Norihiro Izumi
Abstract:
Sedimentation is a significant threat to the sustainability of reservoirs and their watersheds. In the Philippines, the Pulangi watershed has experienced high sediment loss, mainly due to land conversions and plantations, with critical erosion rates beyond the tolerable limit of 10 ton/ha/yr in all of its sub-basins. Given this, prediction of runoff volume and sediment yield is essential for realistically examining the country's soil conservation techniques. In this research, the Pulangi watershed was modeled using the Soil and Water Assessment Tool (SWAT) to predict the watershed basin's annual runoff and sediment yield. For the calibration and validation of the model, SWAT-CUP was utilized. The model was calibrated with monthly discharge data for 1990-1993 and validated for 1994-1997. The sediment yield was calibrated for 2014 and validated for 2015 because of limited observed datasets. Uncertainty analysis and calculation of efficiency indexes were accomplished through the SUFI-2 algorithm. According to the coefficient of determination (R2), Nash-Sutcliffe efficiency (NSE), Kling-Gupta efficiency (KGE), and PBIAS, the streamflow calculation indicates good performance for both the calibration and validation periods, while the sediment yield shows satisfactory performance for both calibration and validation. This study was therefore able to identify the most critical sub-basins and the most severe needs for soil conservation. Furthermore, this study will provide baseline information to prevent floods and landslides and serve as a useful reference for land-use policies and watershed management and sustainability in the Pulangi watershed.
Keywords: Pulangi watershed, sediment yield, streamflow, SWAT model
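Of the efficiency indexes listed, the Nash-Sutcliffe efficiency has the simplest form; a minimal sketch of its standard definition (the observed/simulated series here are invented, not the study's data):

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of the observations.
    1.0 is a perfect fit; values <= 0 mean the model is no better than
    predicting the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    svar = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / svar
```

SWAT-CUP reports this alongside R2, KGE, and PBIAS when evaluating each calibration run.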
Procedia PDF Downloads 209
229 FDX1, a Cuproptosis-Related Gene, Identified as a Potential Target for Human Ovarian Aging
Authors: Li-Te Lin, Chia-Jung Li, Kuan-Hao Tsui
Abstract:
Cuproptosis, a newly identified cell death mechanism, has attracted attention for its association with various diseases. However, the genetic interplay between cuproptosis and ovarian aging remains largely unexplored. This study aims to address this gap by analyzing datasets related to ovarian aging and cuproptosis. Spatial transcriptome analyses were conducted in the ovaries of both young and aged female mice to elucidate the role of FDX1. Comprehensive bioinformatics analyses, facilitated by R software, identified FDX1 as a potential cuproptosis-related gene with implications for ovarian aging. Clinical infertility biopsies were examined to validate these findings, showing consistent results in elderly infertile patients. Furthermore, pharmacogenomic analyses of ovarian cell lines explored the intricate association between FDX1 expression levels and sensitivity to specific small molecule drugs. Spatial transcriptome analyses revealed a significant reduction in FDX1 expression in aging ovaries, supported by consistent findings in biopsies from elderly infertile patients. Pharmacogenomic investigations indicated that modulating FDX1 could influence drug responses in ovarian-related therapies. This study pioneers the identification of FDX1 as a cuproptosis-related gene linked to ovarian aging. These findings not only contribute to understanding the mechanisms of ovarian aging but also position FDX1 as a potential diagnostic biomarker and therapeutic target. Further research may establish FDX1's pivotal role in advancing precision medicine and therapies for ovarian-related conditions.
Keywords: cuproptosis, FDX1, ovarian aging, biomarker
Procedia PDF Downloads 39
228 Efficacy of Deep Learning for Below-Canopy Reconstruction of Satellite and Aerial Sensing Point Clouds through Fractal Tree Symmetry
Authors: Dhanuj M. Gandikota
Abstract:
Sensor-derived three-dimensional (3D) point clouds of trees are invaluable in remote sensing analysis for the accurate measurement of key structural metrics, bio-inventory values, spatial planning/visualization, and ecological modeling. Machine learning (ML) holds potential for addressing the restrictive tradeoffs in cost, spatial coverage, resolution, and information gain that exist among current point cloud sensing methods. Terrestrial laser scanning (TLS) remains the highest-fidelity source of both canopy and below-canopy structural features, but its usage is limited in both coverage and cost, requiring manual deployment to map out large forested areas. While aerial laser scanning (ALS) remains a reliable avenue of LiDAR active remote sensing, ALS is also cost-restrictive in its deployment methods. Space-borne photogrammetry from high-resolution satellite constellations is an avenue of passive remote sensing with promising viability for the accurate construction of vegetation 3D point clouds. It provides both the lowest comparative cost and the largest spatial coverage across remote sensing methods. However, both space-borne photogrammetry and ALS demonstrate technical limitations in the capture of valuable below-canopy point cloud data. Looking to minimize these tradeoffs, we explored deep learning (DL), a class of powerful ML algorithms that shows promise in recent research on 3D point cloud reconstruction and interpolation. Our research details the efficacy of applying these DL techniques to reconstruct accurate below-canopy point clouds from space-borne and aerial remote sensing through learned patterns of tree species fractal symmetry properties and the supplementation of locally sourced bio-inventory metrics. From our dataset, consisting of tree point clouds obtained from TLS, we deconstructed the point clouds of each tree into those that would be obtained through ALS and satellite photogrammetry of varying resolutions.
We fed this ALS/satellite point cloud dataset, along with the simulated local bio-inventory metrics, into the DL point cloud reconstruction architectures to generate the full 3D tree point clouds (the ground-truth values are the full TLS tree point clouds containing the below-canopy information). Point cloud reconstruction accuracy was validated both through the measurement of error against the original TLS point clouds and through the error in extraction of key structural metrics, such as crown base height, diameter above root crown, and leaf/wood volume. The results of this research additionally demonstrate the supplemental performance gain of using minimal locally sourced bio-inventory metric information as an input in ML systems to reach specified accuracy thresholds of tree point cloud reconstruction. This research provides insight into methods for the rapid, cost-effective, and accurate construction of below-canopy tree 3D point clouds, as well as the potential of ML and DL to learn complex, unmodeled patterns of fractal tree growth symmetry.
Keywords: deep learning, machine learning, satellite, photogrammetry, aerial laser scanning, terrestrial laser scanning, point cloud, fractal symmetry
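Reconstruction error against the TLS ground truth can be quantified with a nearest-neighbour metric. The abstract does not name its error measure, so the symmetric Chamfer distance below is an illustrative common choice, not necessarily the study's own; the tiny clouds are hypothetical.

```python
def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two 3D point sets: the mean
    squared distance from each point to its nearest neighbour in the
    other cloud, summed over both directions (brute force; fine for
    small clouds, use a k-d tree for large ones)."""
    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    def one_way(src, dst):
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(cloud_a, cloud_b) + one_way(cloud_b, cloud_a)

# Hypothetical reconstructed vs. TLS reference points (metres)
reconstructed = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
reference = [(0.0, 0.1, 0.0), (1.0, 0.0, 2.1)]
error = chamfer_distance(reconstructed, reference)
```

A reconstruction identical to the reference scores zero, so the metric decreases monotonically as the generated below-canopy points approach the TLS truth.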
Procedia PDF Downloads 102
227 Philippine Site Suitability Analysis for Biomass, Hydro, Solar, and Wind Renewable Energy Development Using Geographic Information System Tools
Authors: Jara Kaye S. Villanueva, M. Rosario Concepcion O. Ang
Abstract:
For the past few years, the Philippines has depended on oil, coal, and fossil fuels for most of its energy. According to the Department of Energy (DOE), the dominance of coal in the energy mix will continue until the year 2020. The expanding energy needs of the country have led to increasing efforts to promote and develop renewable energy. This research is part of a government initiative in preparation for renewable energy development and expansion in the country. The Philippine Renewable Energy Resource Mapping from Light Detection and Ranging (LiDAR) Surveys is a three-year government project which aims to assess and quantify the renewable energy potential of the country and to put it into usable maps. This study focuses on the site suitability analysis of four renewable energy sources: biomass (coconut, corn, rice, and sugarcane), hydro, solar, and wind energy. Site assessment is a key component in determining the most suitable locations for the construction of renewable energy power plants. The method maximizes the use of technical resource assessment while taking environmental, social, and accessibility aspects into account in identifying potential sites, by integrating two different methods: Multi-Criteria Decision Analysis (MCDA) and Geographic Information System (GIS) tools. For the MCDA, Analytical Hierarchy Processing (AHP) is employed to determine the parameters needed for the suitability analysis. To structure these site suitability parameters, experts from different fields were consulted: scientists, policy makers, environmentalists, and industrialists. Consulting a well-represented group of experts is important to avoid bias in the resulting hierarchy levels and weight matrices. AHP pairwise matrix computation is utilized to derive weights per level from the experts' feedback.
Threshold values derived from related literature, international studies, and government laws were then reviewed with energy specialists from the DOE. Geospatial analysis using GIS tools translates these decision-support outputs into visual maps. In particular, this study uses Euclidean distance to compute the distance values of each parameter, a fuzzy membership algorithm to normalize the Euclidean distance output, and the Weighted Overlay tool to aggregate the layers. Using the Natural Breaks algorithm, the suitability ratings of each map are classified into 5 discrete categories of suitability index: (1) not suitable, (2) least suitable, (3) suitable, (4) moderately suitable, and (5) highly suitable. In this method, similar values are grouped into the same class, with class boundaries set where the differences between boundary values are largest. Results show that over the entire Philippine area of responsibility, biomass has the highest suitability rating, with rice the most suitable at 75.76% suitability, whereas wind has the lowest suitability percentage at 10.28%. Solar and hydro fall between the two, with suitability values of 28.77% and 21.27%, respectively.
Keywords: site suitability, biomass energy, hydro energy, solar energy, wind energy, GIS
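The AHP weight derivation mentioned above can be sketched with the normalized-column-mean approximation to the principal eigenvector; the 3x3 reciprocal matrix below is a hypothetical example, not the study's actual criteria or judgments.

```python
def ahp_weights(pairwise):
    """Approximate AHP priority weights from a reciprocal pairwise
    comparison matrix: normalize each column to sum to 1, then average
    each row (the normalized-column-mean method)."""
    n = len(pairwise)
    col_sums = [sum(row[j] for row in pairwise) for j in range(n)]
    normalized = [[pairwise[i][j] / col_sums[j] for j in range(n)]
                  for i in range(n)]
    return [sum(normalized[i]) / n for i in range(n)]

# Hypothetical judgments for 3 criteria, e.g. resource density vs.
# accessibility vs. environmental constraint: criterion 1 is judged
# 3x as important as 2 and 5x as important as 3, and so on.
M = [[1,     3,     5],
     [1 / 3, 1,     3],
     [1 / 5, 1 / 3, 1]]
weights = ahp_weights(M)
```

The weights sum to 1 by construction and preserve the experts' ordering of importance; in practice a consistency-ratio check on the matrix would follow before the weights feed the Weighted Overlay.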
Procedia PDF Downloads 149
226 Trading off Accuracy for Speed in Powerdrill
Authors: Filip Buruiana, Alexander Hall, Reimar Hofmann, Thomas Hofmann, Silviu Ganceanu, Alexandru Tudorica
Abstract:
In-memory column-stores make interactive analysis feasible for many big data scenarios. PowerDrill is a system used internally at Google for exploration of log data. Even though it is a highly parallelized column-store and uses in-memory caching, interactive response times cannot be achieved for all datasets (note that it is common to analyze data with 50 billion records in PowerDrill). In this paper, we investigate two orthogonal approaches to optimize performance at the expense of an acceptable loss of accuracy. Both approaches can be implemented as outer wrappers around existing database engines, so they should be easily applicable to other systems. For the first optimization, we show that memory is the limiting factor in executing queries at speed and therefore explore possibilities to improve memory efficiency. We adapt some of the theory behind data sketches to reduce the size of particularly expensive fields in our largest tables by a factor of 4.5 when compared to a standard compression algorithm. This saves 37% of the overall memory in PowerDrill and introduces a 0.4% relative error in the 90th percentile for results of queries with the expensive fields. We additionally evaluate the effects of sampling on accuracy and propose a simple heuristic for annotating individual result values as accurate (or not). Based on measurements of user behavior in our real production system, we show that these estimates are essential for interpreting intermediate results before final results are available. For a large set of queries, this effectively brings down the 95th latency percentile from 30 to 4 seconds.
Keywords: big data, in-memory column-store, high-performance SQL queries, approximate SQL queries
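The second optimization (sampling plus per-value accuracy annotation) can be illustrated as follows. The 100-hit cutoff is a stand-in heuristic based on the rule of thumb that the relative standard error of a Bernoulli-sampled count scales like 1/sqrt(hits); it is not PowerDrill's actual annotation logic.

```python
import random

def approximate_count(records, predicate, sample_rate=0.01, seed=7):
    """Estimate how many records satisfy `predicate` from a uniform
    Bernoulli sample, and flag the estimate as 'accurate' only when
    the sample contains enough hits to trust it."""
    rng = random.Random(seed)
    sample = [r for r in records if rng.random() < sample_rate]
    hits = sum(1 for r in sample if predicate(r))
    estimate = round(hits / sample_rate)   # scale back up to the full table
    accurate = hits >= 100                 # ~10% relative error or better
    return estimate, accurate

# Hypothetical log table: count even-keyed records out of 100,000
est, ok = approximate_count(range(100_000), lambda r: r % 2 == 0)
```

The annotation is what lets a user decide whether an intermediate result (returned in seconds) is already good enough, or whether to wait for the exact scan.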
Procedia PDF Downloads 259
225 Spatial and Geostatistical Analysis of Surficial Soils of the Contiguous United States
Authors: Rachel Hetherington, Chad Deering, Ann Maclean, Snehamoy Chatterjee
Abstract:
The U.S. Geological Survey conducted a soil survey and subsequent mineralogical and geochemical analyses of over 4800 samples taken across the contiguous United States between 2007 and 2013. At each location, samples were taken from the top 5 cm, the A-horizon, and the C-horizon. Many studies have looked at the correlation between the mineralogical and geochemical content of soils and influencing factors such as parent lithology, climate, soil type, and age, but little has been done to quantify and assess the correlation between elements in the soil on a national scale. GIS was used for the mapping and multivariate interpolation of over 40 major and trace elements for surficial soils (0-5 cm depth). Qualitative analysis of the spatial distribution across the U.S. shows distinct patterns amongst elements both within the same periodic groups and within different periodic groups, and therefore with different behavioural characteristics. Results show the emergence of 4 main patterns of high-concentration areas: vertically along the west coast, a C-shape through the states around Utah and northern Arizona, a V-shape through the Midwest connecting to the Appalachians, and along the Appalachians. The Band Collection Statistics tool in GIS was used to quantitatively analyse the geochemical raster datasets and calculate a correlation matrix. Patterns emerged which were not identified in the qualitative analysis, many of them amongst elements with very different characteristics. Preliminary results show 41 element pairings with a strong positive correlation (≥ 0.75). Both qualitative and quantitative analyses on this scale could increase knowledge of the relationships between element distribution and behaviour in surficial soils of the U.S.
Keywords: correlation matrix, geochemical analyses, spatial distribution of elements, surficial soils
Procedia PDF Downloads 126
224 A Survey of Skin Cancer Detection and Classification from Skin Lesion Images Using Deep Learning
Authors: Joseph George, Anne Kotteswara Roa
Abstract:
Skin disease is one of the most common kinds of health issues faced by people nowadays. Skin cancer (SC) is one of them, and its detection relies on skin biopsy outputs and the expertise of doctors, but this consumes time and can yield inaccurate results. At an early stage, skin cancer detection is a challenging task, and untreated cancer easily spreads to the whole body, increasing the mortality rate; skin cancer is curable when it is detected early. Correct and accurate classification depends on identifying disease features such as shape, size, color, and symmetry. Many skin diseases share similar characteristics, making it challenging to select important features from skin cancer image datasets. Diagnostic accuracy can therefore be improved by an automated skin cancer detection and classification framework, which also mitigates the scarcity of human experts. Recently, deep learning techniques such as the convolutional neural network (CNN), deep belief network (DBN), artificial neural network (ANN), recurrent neural network (RNN), and long short-term memory (LSTM) have been widely used for the identification and classification of skin cancers. This survey reviews different DL techniques for skin cancer identification and classification. Performance metrics such as precision, recall, accuracy, sensitivity, specificity, and F-measure are used to evaluate the effectiveness of SC identification using DL techniques. By using these DL techniques, classification accuracy increases while computational complexity and time consumption are mitigated.
Keywords: skin cancer, deep learning, performance measures, accuracy, datasets
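The performance metrics listed above are standard; precision, recall, and F-measure follow directly from the prediction counts, as sketched below (the labels are hypothetical, with 1 marking a malignant lesion).

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class from
    paired ground-truth and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical lesion labels: 1 = malignant, 0 = benign
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
p, r, f = precision_recall_f1(y_true, y_pred)
```

For medical screening, recall (sensitivity) is usually weighted more heavily than precision, since a missed malignancy is costlier than a false alarm.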
Procedia PDF Downloads 128
223 Unsupervised Echocardiogram View Detection via Autoencoder-Based Representation Learning
Authors: Andrea Treviño Gavito, Diego Klabjan, Sanjiv J. Shah
Abstract:
Echocardiograms serve as pivotal resources for clinicians in diagnosing cardiac conditions, offering non-invasive insights into a heart’s structure and function. When echocardiographic studies are conducted, no standardized labeling of the acquired views is performed. Employing machine learning algorithms for automated echocardiogram view detection has emerged as a promising solution to enhance efficiency in echocardiogram use for diagnosis. However, existing approaches predominantly rely on supervised learning, necessitating labor-intensive expert labeling. In this paper, we introduce a fully unsupervised echocardiographic view detection framework that leverages convolutional autoencoders to obtain lower dimensional representations and the K-means algorithm for clustering them into view-related groups. Our approach focuses on discriminative patches from echocardiographic frames. Additionally, we propose a trainable inverse average layer to optimize decoding of average operations. By integrating both public and proprietary datasets, we obtain a marked improvement in model performance when compared to utilizing a proprietary dataset alone. Our experiments show boosts of 15.5% in accuracy and 9.0% in the F-1 score for frame-based clustering, and 25.9% in accuracy and 19.8% in the F-1 score for view-based clustering. Our research highlights the potential of unsupervised learning methodologies and the utilization of open-sourced data in addressing the complexities of echocardiogram interpretation, paving the way for more accurate and efficient cardiac diagnoses.
Keywords: artificial intelligence, echocardiographic view detection, echocardiography, machine learning, self-supervised representation learning, unsupervised learning
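The clustering stage can be sketched independently of the autoencoder: given low-dimensional frame embeddings (the hypothetical 2-D points below stand in for the learned representations), plain K-means groups them into view-related clusters.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain K-means on low-dimensional embeddings: alternate between
    assigning each point to its nearest center and recomputing centers
    as cluster means (brute force, stdlib only)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centers[c])))
            clusters[nearest].append(p)
        centers = [tuple(sum(coord) / len(cl) for coord in zip(*cl))
                   if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers, clusters

# Hypothetical embeddings of frames from two echocardiographic views
frames = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, clusters = kmeans(frames, k=2)
```

In the paper's pipeline each cluster would then be inspected once and assigned a view name, replacing per-frame expert labeling with per-cluster labeling.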
Procedia PDF Downloads 32
222 Integrating Knowledge Distillation of Multiple Strategies
Authors: Min Jindong, Wang Mingxia
Abstract:
With the widespread use of artificial intelligence in daily life, computer vision, and especially deep convolutional neural network models, has developed rapidly. With the increasing complexity of real visual target detection tasks and improvements in recognition accuracy, target detection network models have also grown very large. Huge deep neural network models are not conducive to deployment on edge devices with limited resources, and the timeliness of network model inference is poor. In this paper, knowledge distillation is used to compress a huge and complex deep neural network model, and the knowledge contained in the complex network model is comprehensively transferred to another lightweight network model. Different from traditional knowledge distillation methods, we propose a novel knowledge distillation that incorporates multi-faceted features, called M-KD. When training and optimizing the deep neural network model for target detection, the soft target output of the teacher network, the relationships between the layers of the teacher network, and the feature attention maps of the hidden layers of the teacher network are all transferred to the student network as knowledge. At the same time, we also introduce an intermediate transition layer, that is, an intermediate guidance layer, between the teacher network and the student network to bridge the large gap between them. Finally, this paper adds an exploration module to the traditional knowledge distillation teacher-student network model, so that the student network model not only inherits the knowledge of the teacher network but also explores some new knowledge and characteristics.
Comprehensive experiments in this paper using different distillation parameter configurations across multiple datasets and convolutional neural network models demonstrate that our proposed new network model achieves substantial improvements in speed and accuracy performance.
Keywords: object detection, knowledge distillation, convolutional network, model compression
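The soft-target component of the distillation described above is the cross-entropy between temperature-softened teacher and student distributions; a minimal sketch follows, with hypothetical detector class logits (the paper's full M-KD loss adds layer-relationship and attention-map terms on top of this).

```python
import math

def softmax_t(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher temperatures
    flatten the distribution and expose the teacher's 'dark knowledge'."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(teacher_logits, student_logits, temperature=4.0):
    """Cross-entropy between the softened teacher and student
    distributions: the classic soft-target term in distillation."""
    p = softmax_t(teacher_logits, temperature)
    q = softmax_t(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

teacher = [2.0, 0.5, -1.0]        # hypothetical teacher logits
student_good = [1.8, 0.6, -0.9]   # student close to the teacher
student_bad = [-1.0, 0.5, 2.0]    # student with reversed ranking
```

By Gibbs' inequality the loss is minimized when the student matches the teacher's softened distribution exactly, which is what drives the transfer.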
Procedia PDF Downloads 278
221 Satellite Connectivity for Sustainable Mobility
Authors: Roberta Mugellesi Dow
Abstract:
As the climate crisis becomes unignorable, it is imperative that new services are developed addressing not only the needs of customers but also their impact on the environment. The Telecommunication and Integrated Application (TIA) Directorate of ESA is supporting the green transition with particular attention to sustainable mobility. "Accelerating the shift to sustainable and smart mobility" is at the core of the European Green Deal strategy, which seeks a 90% reduction in related emissions by 2050. Transforming the way that people and goods move is essential to increasing mobility while decreasing environmental impact, and transport must be considered holistically to produce a shared vision of green intermodal mobility. The use of space technologies, integrated with terrestrial technologies, is an enabler of smarter traffic management and increased transport efficiency for automated and connected multimodal mobility. Satellite connectivity, including future 5G networks, and digital technologies such as digital twins, AI, machine learning, and cloud-based applications are key enablers of sustainable mobility. SatCom is essential to ensure that connectivity is ubiquitously available, even in remote and rural areas or in case of a failure, through the convergence of terrestrial and SatCom connectivity networks. This is especially crucial when there are risks of network failures or cyber-attacks targeting terrestrial communication; SatCom ensures communication network robustness and resilience. The combination of terrestrial and satellite communication networks is making possible intelligent and ubiquitous V2X systems and PNT services with significantly enhanced reliability and security, hyper-fast wireless access, and much more seamless communication coverage. SatNav is essential in providing accurate tracking and tracing capabilities for automated vehicles and in guiding them to target locations.
SatNav can also enable location-based services like car-sharing applications, parking assistance, and fare payment. In addition to GNSS receivers, wireless connections, radar, lidar, and other installed sensors can enable automated vehicles to monitor their surroundings, to 'talk to each other' and with infrastructure in real time, and to respond to changes instantaneously. SatEO can be used to provide the maps required for traffic management, as well as to evaluate conditions on the ground, assess changes, and provide key data for monitoring and forecasting air pollution and other important parameters. Earth Observation derived data are used to provide meteorological information such as wind speed and direction, humidity, and other parameters that must be incorporated into models contributing to traffic management services. The paper will provide examples of services and applications that have been developed aiming to identify innovative solutions and new business models enabled by new digital technologies, engaging the space and non-space ecosystems together to deliver value and provide innovative, greener solutions in the mobility sector. Examples include connected autonomous vehicles, electric vehicles, green logistics, and others. Relevant technologies include hybrid SatCom and 5G providing ubiquitous coverage, IoT integration with non-space technologies, as well as navigation and PNT technology, and other space data.
Keywords: sustainability, connectivity, mobility, satellites
Procedia PDF Downloads 133
220 Estimation of Atmospheric Parameters for Weather Study and Forecast over Equatorial Regions Using the Ground-Based Global Positioning System
Authors: Asmamaw Yehun, Tsegaye Kassa, Addisu Hunegnaw, Martin Vermeer
Abstract:
There are various models to estimate neutral atmospheric parameter values, such as in-situ measurements and reanalysis datasets from numerical models. Accurately estimated values of the atmospheric parameters are useful for weather forecasting, climate modeling, and monitoring of climate change. Recently, Global Navigation Satellite System (GNSS) measurements have been applied to atmospheric sounding due to their robust data quality and wide horizontal and vertical coverage. Global Positioning System (GPS) solutions that include tropospheric parameters constitute a reliable set of data to be assimilated into climate models. The objective of this paper is to estimate neutral atmospheric parameters, namely the Wet Zenith Delay (WZD), Precipitable Water Vapour (PWV), and Total Zenith Delay (TZD), using six selected GPS stations in equatorial regions, more precisely, Ethiopian GPS stations, from observational data covering 2012 to 2015. Based on the historic GPS-derived estimates of PWV, we forecasted the PWV from 2015 to 2030. During data processing and analysis, we applied the GAMIT-GLOBK software packages to estimate the atmospheric parameters. We found that the minimum annual-averaged PWV is 9.72 mm for IISC and the maximum 50.37 mm for BJCO, while the minimum annual-averaged WZD is 6 cm for IISC and the maximum 31 cm for BDMT. Over the long series of observations (2012 to 2015), we also found trends and cyclic patterns of WZD, PWV, and TZD for all stations.
Keywords: atmosphere, GNSS, neutral atmosphere, precipitable water vapour
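The step from wet zenith delay to PWV uses the standard dimensionless conversion factor Pi. The refractivity constants below are commonly used textbook values and may differ slightly from the constants adopted in the paper's GAMIT-GLOBK processing.

```python
def pwv_from_zwd(zwd_mm, tm_kelvin):
    """Convert zenith wet delay (mm) to precipitable water vapour (mm)
    using PWV = Pi * ZWD with
    Pi = 1e6 / (rho * Rv * (k3 / Tm + k2')).
    Tm is the water-vapour weighted mean temperature of the column."""
    rho = 1000.0   # liquid water density, kg/m^3
    rv = 461.5     # specific gas constant of water vapour, J/(kg K)
    k2p = 22.1     # refractivity constant k2', K/hPa
    k3 = 3.739e5   # refractivity constant k3, K^2/hPa
    # divide by 100 to convert K/hPa to K/Pa for unit consistency
    pi_factor = 1e6 / (rho * rv * ((k3 / tm_kelvin) + k2p) / 100.0)
    return pi_factor * zwd_mm
```

With Tm near 275 K, Pi comes out around 0.16, so a WZD of 31 cm maps to a PWV of roughly 48 mm, the same order as the ~50 mm maximum PWV quoted above.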
Procedia PDF Downloads 61
219 The Targeting Logic of Terrorist Groups in the Sahel
Authors: Mathieu Bere
Abstract:
Al-Qaeda and Islamic State-affiliated groups such as Ja'amat Nusra al Islam Wal Muslimim (JNIM) and the Islamic State-Greater Sahara Faction, now part of the Boko Haram splinter group Islamic State in West Africa, were responsible, between 2018 and 2020, for at least 1,333 violent incidents against both military and civilian targets, including the assassination and kidnapping for ransom of Western citizens in Mali, Burkina Faso, and Niger (the Central Sahel). Protecting civilians from the terrorist violence now spreading from the Sahel to the coastal countries of West Africa has been very challenging, mainly because of the many unknowns that surround the perpetrators. To contribute to better protection of civilians in the region, this paper aims to shed light on the motivations and targeting logic of jihadist perpetrators of terrorist violence against civilians in the central Sahel region. To that end, it draws on relevant secondary data retrieved from datasets, the media, and the existing literature, as well as on primary data collected through interviews and surveys in Burkina Faso. Analysis of the data with the support of qualitative and statistical analysis software shows that military and rational strategic motives, more than purely ideological or religious motives, have been the main drivers of terrorist violence, which strategically targeted government symbols and representatives as well as local leaders in the central Sahel. Behind this targeting logic, the jihadist grand strategy emerges: wiping out the Western-inspired legal, education, and governance system in order to replace it with an Islamic, sharia-based political, legal, and educational system.
Keywords: terrorism, jihadism, Sahel, targeting logic
Procedia PDF Downloads 85
218 Smart Mobility Planning Applications in Meeting the Needs of the Urbanization Growth
Authors: Caroline Atef Shoukry Tadros
Abstract:
Massive urbanization growth threatens the sustainability of cities and the quality of city life. This has raised the need for an alternate model of sustainability, so we need to plan future cities in a smarter way, with smarter mobility. Smart Mobility planning applications are solutions that use digital technologies and infrastructure advances to improve the efficiency, sustainability, and inclusiveness of urban transportation systems. They can contribute to meeting the needs of urbanization growth by addressing the challenges of traffic congestion, pollution, accessibility, and safety in cities. Some examples of Smart Mobility planning applications follow. Mobility-as-a-Service: a service that integrates different transport modes, such as public transport, shared mobility, and active mobility, into a single platform that allows users to plan, book, and pay for their trips. This can reduce reliance on private cars, optimize the use of existing infrastructure, and provide more choices and convenience for travelers; MaaS Global is a company that offers mobility-as-a-service solutions in several cities around the world. Traffic flow optimization: a solution that uses data analytics, artificial intelligence, and sensors to monitor and manage traffic conditions in real time. This can reduce congestion, emissions, and travel time, as well as improve road safety and user satisfaction; Waycare is a platform that leverages data from various sources, such as connected vehicles, mobile applications, and road cameras, to provide traffic management agencies with insights and recommendations to optimize traffic flow. Logistics optimization: a solution that uses smart algorithms, blockchain, and IoT to improve the efficiency and transparency of the delivery of goods and services in urban areas. This can reduce the costs, emissions, and delays associated with logistics, as well as enhance customer experience and trust.
ShipChain is a blockchain-based platform that connects shippers, carriers, and customers and provides end-to-end visibility and traceability of shipments. Autonomous vehicles: a solution that uses advanced sensors, software, and communication systems to enable vehicles to operate without human intervention. This can improve the safety, accessibility, and productivity of transportation, as well as reduce the need for parking space and infrastructure maintenance; Waymo is a company that develops and operates autonomous vehicles for purposes such as ride-hailing, delivery, and trucking. These are some of the ways that Smart Mobility planning applications can contribute to meeting the needs of urbanization growth. However, there are also various opportunities and challenges related to the implementation and adoption of these solutions, such as regulatory, ethical, social, and technical aspects. Therefore, it is important to consider the specific context and needs of each city and its stakeholders when designing and deploying Smart Mobility planning applications.
Keywords: smart mobility planning, smart mobility applications, smart mobility techniques, smart mobility tools, smart transportation, smart cities, urbanization growth, future smart cities, intelligent cities, ICT information and communications technologies, IoT internet of things, sensors, lidar, digital twin, ai artificial intelligence, AR augmented reality, VR virtual reality, robotics, cps cyber physical systems, citizens design science
Procedia PDF Downloads 73
217 Bias-Corrected Estimation Methods for Receiver Operating Characteristic Surface
Authors: Khanh To Duc, Monica Chiogna, Gianfranco Adimari
Abstract:
With three diagnostic categories, assessment of the performance of diagnostic tests is achieved by the analysis of the receiver operating characteristic (ROC) surface, which generalizes the ROC curve for binary diagnostic outcomes. The volume under the ROC surface (VUS) is a summary index usually employed for measuring the overall diagnostic accuracy. When the true disease status can be exactly assessed by means of a gold standard (GS) test, unbiased nonparametric estimators of the ROC surface and VUS are easily obtained. In practice, unfortunately, disease status verification via the GS test could be unavailable for all study subjects, due to the expensiveness or invasiveness of the GS test. Thus, often only a subset of patients undergoes disease verification. Statistical evaluations of diagnostic accuracy based only on data from subjects with verified disease status are typically biased. This bias is known as verification bias. Here, we consider the problem of correcting for verification bias when continuous diagnostic tests for three-class disease status are considered. We assume that selection for disease verification does not depend on disease status, given test results and other observed covariates, i.e., we assume that the true disease status, when missing, is missing at random. Under this assumption, we discuss several solutions for ROC surface analysis based on imputation and re-weighting methods. In particular, verification bias-corrected estimators of the ROC surface and of VUS are proposed, namely, full imputation, mean score imputation, inverse probability weighting and semiparametric efficient estimators. Consistency and asymptotic normality of the proposed estimators are established, and their finite sample behavior is investigated by means of Monte Carlo simulation studies. Two illustrations using real datasets are also given.
Keywords: imputation, missing at random, inverse probability weighting, ROC surface analysis
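When the gold standard is available for every subject, the unbiased nonparametric VUS estimator mentioned above reduces to counting correctly ordered cross-class triples of test values; a minimal sketch (ties are ignored here for simplicity, though for discrete tests they would receive fractional weight):

```python
def vus(class1, class2, class3):
    """Nonparametric VUS estimate: the fraction of triples (x, y, z),
    one test value drawn from each diagnostic class, that are correctly
    ordered x < y < z by increasing disease severity."""
    triples = [(x, y, z) for x in class1 for y in class2 for z in class3]
    ordered = sum(1 for x, y, z in triples if x < y < z)
    return ordered / len(triples)

# Hypothetical test values for three disease classes
perfect = vus([1, 2], [3, 4], [5, 6])     # fully separated classes
overlap = vus([1, 5], [2, 4], [3, 6])     # partially overlapping classes
```

A useless test yields VUS = 1/6 (the chance that a random triple happens to be ordered), and a perfect three-class separator yields 1; the verification-bias-corrected estimators in the paper reweight or impute this same triple count when disease status is only partially verified.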
Procedia PDF Downloads 416