Search results for: CART algorithm
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3548

8 Interpretable Deep Learning Models for Medical Condition Identification

Authors: Dongping Fang, Lian Duan, Xiaojing Yuan, Mike Xu, Allyn Klunder, Kevin Tan, Suiting Cao, Yeqing Ji

Abstract:

Accurate prediction of a medical condition from direct clinical evidence is a long-sought goal in the medical management and health insurance field. Although great progress has been made with machine learning algorithms, the medical community is still, to a certain degree, skeptical about the models' accuracy and interpretability. This paper presents an innovative hierarchical attention deep learning model that achieves good prediction and clear interpretability that can be easily understood by medical professionals. This deep learning model uses a hierarchical attention structure that matches naturally with the medical history data structure and reflects the member’s encounter (date of service) sequence. The model attention structure consists of 3 levels: (1) attention on the medical code types (diagnosis codes, procedure codes, lab test results, and prescription drugs), (2) attention on the sequential medical encounters within a type, (3) attention on the medical codes within an encounter and type. This model is applied to predict the occurrence of stage 3 chronic kidney disease (CKD3), using three years of medical history of Medicare Advantage (MA) members from a top health insurance company. The model takes members’ medical events, both claims and electronic medical record (EMR) data, as input, makes a prediction of CKD3 and calculates the contribution of individual events to the predicted outcome. The model outcome can be easily explained with the clinical evidence identified by the model algorithm. Here are examples: Member A had 36 medical encounters in the past three years: multiple office visits, lab tests and medications. The model predicts that member A has a high risk of CKD3, with the following strongly contributing clinical events: multiple high ‘Creatinine in Serum or Plasma’ tests and multiple low ‘Glomerular filtration rate’ tests indicating poor kidney function. Among the abnormal lab tests, more recent results contributed more to the prediction. 
The model also indicates regular office visits, no abnormal findings of medical examinations, and taking proper medications decreased the CKD3 risk. Member B had 104 medical encounters in the past 3 years and was predicted to have a low risk of CKD3, because the model didn’t identify diagnoses, procedures, or medications related to kidney disease, and many lab test results, including ‘Glomerular filtration rate’ were within the normal range. The model accurately predicts members A and B and provides interpretable clinical evidence that is validated by clinicians. Without extra effort, the interpretation is generated directly from the model and presented together with the occurrence date. Our model uses the medical data in its most raw format without any further data aggregation, transformation, or mapping. This greatly simplifies the data preparation process, mitigates the chance for error and eliminates post-modeling work needed for traditional model explanation. To our knowledge, this is the first paper on an interpretable deep-learning model using a 3-level attention structure, sourcing both EMR and claim data, including all 4 types of medical data, on the entire Medicare population of a big insurance company, and more importantly, directly generating model interpretation to support user decision. In the future, we plan to enrich the model input by adding patients’ demographics and information from free-texted physician notes.
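
The three attention levels described in this abstract can be illustrated with a toy softmax-weighted pooling. This is only a sketch under stated assumptions, not the authors' model: the attention scores are fixed by hand (a trained network would learn them), each medical code is reduced to a single hypothetical risk value, and all names (`attend`, `predict`, `history`) are invented for illustration.

```python
import math

def softmax(scores):
    """Convert raw attention scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(items):
    """Pool (score, value) pairs into one value; also return the weights."""
    weights = softmax([score for score, _ in items])
    pooled = sum(w * value for w, (_, value) in zip(weights, items))
    return pooled, weights

def predict(history, encounter_scores, type_scores):
    """Apply attention bottom-up: codes, then encounters, then code types."""
    type_items = []
    for code_type, encounters in history.items():
        enc_items = []
        for i, codes in enumerate(encounters):
            enc_value, _ = attend(codes)            # level 3: codes in an encounter
            enc_items.append((encounter_scores[code_type][i], enc_value))
        type_value, _ = attend(enc_items)           # level 2: encounters in a type
        type_items.append((type_scores[code_type], type_value))
    risk, type_weights = attend(type_items)         # level 1: code types
    contributions = {
        code_type: w * value
        for code_type, w, (_, value) in zip(history, type_weights, type_items)
    }
    return risk, contributions

# Hypothetical toy member: code type -> encounters -> (score, value) per code.
history = {
    "lab": [[(2.0, 0.9), (0.5, 0.2)], [(1.0, 0.7)]],
    "diagnosis": [[(0.1, 0.1)]],
}
risk, contributions = predict(
    history,
    encounter_scores={"lab": [0.2, 1.5], "diagnosis": [0.0]},
    type_scores={"lab": 1.0, "diagnosis": -1.0},
)
```

Because every level is a weighted average, the per-type contributions add up to the predicted value, which is one simple way a prediction can be traced back to the events that drove it.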

Keywords: deep learning, interpretability, attention, big data, medical conditions

Procedia PDF Downloads 61
7 Ensemble Sampler For Infinite-Dimensional Inverse Problems

Authors: Jeremie Coullon, Robert J. Webber

Abstract:

We introduce a Markov chain Monte Carlo (MCMC) sampler for infinite-dimensional inverse problems. Our sampler is based on the affine invariant ensemble sampler, which uses interacting walkers to adapt to the covariance structure of the target distribution. We extend this ensemble sampler for the first time to infinite-dimensional function spaces, yielding a highly efficient gradient-free MCMC algorithm. Because our ensemble sampler does not require gradients or posterior covariance estimates, it is simple to implement and broadly applicable. In many Bayesian inverse problems, Markov chain Monte Carlo (MCMC) methods are needed to approximate distributions on infinite-dimensional function spaces, for example, in groundwater flow, medical imaging, and traffic flow. Yet designing efficient MCMC methods for function spaces has proved challenging. Recent gradient-based MCMC methods, preconditioned MCMC methods, and SMC methods have improved the computational efficiency of functional random walk. However, these samplers require gradients or posterior covariance estimates that may be challenging to obtain. Calculating gradients is difficult or impossible in many high-dimensional inverse problems involving a numerical integrator with a black-box code base. Additionally, accurately estimating posterior covariances can require a lengthy pilot run or adaptation period. These concerns raise the question: is there a functional sampler that outperforms functional random walk without requiring gradients or posterior covariance estimates? To address this question, we consider a gradient-free sampler that avoids explicit covariance estimation yet adapts naturally to the covariance structure of the sampled distribution. This sampler works by considering an ensemble of walkers and interpolating and extrapolating between walkers to make a proposal. 
This is called the affine invariant ensemble sampler (AIES), which is easy to tune, easy to parallelize, and efficient at sampling spaces of moderate dimensionality (less than 20). The main contribution of this work is to propose a functional ensemble sampler (FES) that combines functional random walk and AIES. To apply this sampler, we first calculate the Karhunen–Loève (KL) expansion for the Bayesian prior distribution, assumed to be Gaussian and trace-class. Then, we use AIES to sample the posterior distribution on the low-wavenumber KL components and use the functional random walk to sample the posterior distribution on the high-wavenumber KL components. Alternating between AIES and functional random walk updates, we obtain our functional ensemble sampler that is efficient and easy to use without requiring detailed knowledge of the target distribution. In past work, several authors have proposed splitting the Bayesian posterior into low-wavenumber and high-wavenumber components and then applying enhanced sampling to the low-wavenumber components. Yet compared to these other samplers, FES is unique in its simplicity and broad applicability. FES does not require any derivatives, and the need for derivative-free samplers has previously been emphasized. FES also eliminates the requirement for posterior covariance estimates. Lastly, FES is more efficient than other gradient-free samplers in our tests. In two numerical examples, we apply FES to challenging inverse problems that involve estimating a functional parameter and one or more scalar parameters. We compare the performance of functional random walk, FES, and an alternative derivative-free sampler that explicitly estimates the posterior covariance matrix. We conclude that FES is the fastest available gradient-free sampler for these challenging and multimodal test problems.
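
The "interpolating and extrapolating between walkers" step of AIES is commonly implemented as the stretch move. The sketch below shows only that finite-dimensional building block, not the functional ensemble sampler itself; the 2-D correlated Gaussian target, the walker count, and the function names are assumptions chosen for illustration.

```python
import math
import random

def log_target(x):
    """Hypothetical target: unnormalized log density of a 2-D Gaussian
    with unit variances and correlation 0.9."""
    a, b = x
    rho = 0.9
    return -(a * a - 2 * rho * a * b + b * b) / (2 * (1 - rho * rho))

def stretch_move(walkers, log_prob, a=2.0, rng=random):
    """One sweep of the stretch move; walkers are updated one at a time."""
    n, dim = len(walkers), len(walkers[0])
    accepted = 0
    for k in range(n):
        j = rng.randrange(n - 1)
        if j >= k:
            j += 1  # partner walker, guaranteed to differ from walker k
        # Draw z with density proportional to 1/sqrt(z) on [1/a, a].
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        # z < 1 interpolates toward the partner, z > 1 extrapolates away from it.
        proposal = [xj + z * (xk - xj) for xk, xj in zip(walkers[k], walkers[j])]
        log_ratio = ((dim - 1) * math.log(z)
                     + log_prob(proposal) - log_prob(walkers[k]))
        if math.log(1.0 - rng.random()) < log_ratio:
            walkers[k] = proposal
            accepted += 1
    return accepted

def run(n_walkers=20, n_sweeps=2000, burn_in=500, seed=0):
    """Run the ensemble and return post-burn-in samples from all walkers."""
    rng = random.Random(seed)
    walkers = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_walkers)]
    samples = []
    for sweep in range(n_sweeps):
        stretch_move(walkers, log_target, rng=rng)
        if sweep >= burn_in:
            samples.extend(list(w) for w in walkers)
    return samples
```

The proposal uses only the positions of other walkers, so no gradient or covariance estimate is needed; the single tuning parameter `a` is the stretch scale.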

Keywords: Bayesian inverse problems, Markov chain Monte Carlo, infinite-dimensional inverse problems, dimensionality reduction

Procedia PDF Downloads 126
6 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering  

Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi

Abstract:

In the recent decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA. Accordingly, identifying the tissue origin of these DNA fragments from the plasma can result in more accurate and faster disease diagnosis and more precise treatment protocols. Open chromatin regions (OCRs) are important epigenetic features of DNA that reflect cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. There have been several studies in the area of cancer liquid biopsy that integrate distinct genomic and epigenomic features for early cancer detection along with tissue of origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which leads to substantial cost and time. To overcome these limitations, the idea of predicting OCRs from whole genome sequencing (WGS) data is of particular importance. In this regard, we proposed a computational approach to predict open chromatin regions, as an important epigenetic feature, from cell-free DNA whole genome sequencing data. To fulfill this objective, local sequencing depth is fed to our proposed algorithm, which predicts the most probable open chromatin regions from whole genome sequencing data. Our method integrates a signal processing method with sequencing depth data and includes count normalization, Discrete Fourier Transform conversion, graph construction, graph cut optimization by linear programming, and clustering. 
To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions related to human blood samples in the ATAC-DB database. The percentage of overlap between predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates meaningful prediction. OCRs are known to be located mostly at the transcription start sites (TSS) of genes. In this regard, we compared the concordance between the predicted OCRs and the human gene TSS regions obtained from refTSS, and it showed concordance of around 52.04% with all genes and ~78% with housekeeping genes. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to the existence of several confounding factors, such as technical and biological variations. Although this approach is in its infancy, there has already been an attempt to apply it, which led to a tool named OCRDetector with some restrictions, such as the need for high-depth cfDNA WGS data, prior information about the distribution of OCRs, and the consideration of multiple features. In contrast, we implemented graph signal clustering based on a single depth feature in an unsupervised learning manner, which resulted in faster performance and decent accuracy. Overall, we tried to investigate the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate genetic and epigenetic aspects of a single whole genome sequencing dataset for efficient liquid biopsy-related analysis.
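
The abstract reports a percentage of overlap between predicted and experimentally validated regions but does not say how that percentage is defined. One plausible reading, shown here purely as an assumption, counts a predicted region as a hit when it shares at least one base with any reference region. Intervals are half-open `(start, end)` pairs on a single chromosome, and the coordinates are made up.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]

def percent_overlapping(predicted, reference):
    """Percentage of predicted regions that overlap any reference region."""
    if not predicted:
        return 0.0
    hits = sum(any(overlaps(p, r) for r in reference) for p in predicted)
    return 100.0 * hits / len(predicted)

# Hypothetical toy coordinates, not data from the study.
predicted = [(100, 200), (300, 400), (500, 600)]
reference = [(150, 250), (590, 700)]
pct = percent_overlapping(predicted, reference)  # two of three regions are hits
```

Other definitions (for example, the fraction of overlapping bases, or a minimum reciprocal overlap) would give different numbers, so the metric should be stated alongside any reported percentage.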

Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering

Procedia PDF Downloads 109
5 EcoTeka, an Open-Source Software for Urban Ecosystem Restoration through Technology

Authors: Manon Frédout, Laëtitia Bucari, Mathias Aloui, Gaëtan Duhamel, Olivier Rovellotti, Javier Blanco

Abstract:

Ecosystems must be resilient to ensure cleaner air, better water and soil quality, and thus healthier citizens. Technology can be an excellent tool to support urban ecosystem restoration projects, especially when based on Open Source and promoting Open Data. This is the goal of the ecoTeka application: one single digital tool for tree management which allows decision-makers to improve their urban forestry practices, enabling more responsible urban planning and climate change adaptation. EcoTeka provides city councils with three main functionalities tackling three of their challenges: easier biodiversity inventories, better green space management, and more efficient planning. To answer the cities’ need for reliable tree inventories, the application has been first built with open data coming from the websites OpenStreetMap and OpenTrees, but it will also include very soon the possibility of creating new data. To achieve this, a multi-source algorithm will be elaborated, based on existing artificial intelligence Deep Forest, integrating open-source satellite images, 3D representations from LiDAR, and street views from Mapillary. This data processing will permit identifying individual trees' position, height, crown diameter, and taxonomic genus. To support urban forestry management, ecoTeka offers a dashboard for monitoring the city’s tree inventory and trigger alerts to inform about upcoming due interventions. This tool was co-constructed with the green space departments of the French cities of Alès, Marseille, and Rouen. The third functionality of the application is a decision-making tool for urban planning, promoting biodiversity and landscape connectivity metrics to drive ecosystem restoration roadmap. Based on landscape graph theory, we are currently experimenting with new methodological approaches to scale down regional ecological connectivity principles to local biodiversity conservation and urban planning policies. 
This methodological framework will couple a graph-theoretic approach with biological data, mainly biodiversity occurrence (presence/absence) data available on international (e.g., GBIF), national (e.g., Système d’Information Nature et Paysage) and local (e.g., Atlas de la Biodiversité Communale) biodiversity data sharing platforms, in order to help inform new decisions for the conservation and restoration of ecological networks in urban areas. An experiment on this subject is currently ongoing with Montpellier Mediterranee Metropole. These projects and studies have shown that only 26% of tree inventory data is currently geo-localized in France; the rest is still recorded on paper or in Excel sheets. It seems that technology is not yet used enough to enrich the knowledge city councils have about biodiversity in their city and that existing biodiversity open data (e.g., occurrences, telemetry, or genetic data), species distribution models, and landscape graph connectivity metrics are still underexploited for making rational decisions in landscape and urban planning projects. This is the goal of ecoTeka: to support easier inventories of urban biodiversity and better management of urban spaces through rational planning and decisions relying on open databases. Future studies and projects will focus on the development of tools for reducing the artificialization of soils, selecting plant species adapted to climate change, and highlighting the need for ecosystem and biodiversity services in cities.

Keywords: digital software, ecological design of urban landscapes, sustainable urban development, urban ecological corridor, urban forestry, urban planning

Procedia PDF Downloads 33
4 Synthetic Method of Contextual Knowledge Extraction

Authors: Olga Kononova, Sergey Lyapin

Abstract:

The global information society requires transparency and reliability of data, as well as the ability to manage information resources independently, in particular to search, analyze, and evaluate information, thereby obtaining new expertise. Moreover, satisfying society's information needs increases the efficiency of enterprise management and public administration. The study of structurally organized thematic and semantic contexts of different types, automatically extracted from unstructured data, is one of the important tasks for the application of information technologies in education, science, culture, governance and business. The objectives of this study are the typologization of contextual knowledge and the selection or creation of effective tools for extracting and analyzing contextual knowledge. Explication of various kinds and forms of contextual knowledge involves the development and use of full-text search information systems. For implementation purposes, the authors use services of the e-library 'Humanitariana', such as contextual search, different types of queries (paragraph-oriented query, frequency-ranked query), and automatic extraction of knowledge from scientific texts. The multifunctional e-library 'Humanitariana' is implemented in an Internet architecture in WWS configuration (Web-browser / Web-server / SQL-server). The advantage of using 'Humanitariana' lies in the possibility of combining the resources of several organizations. Scholars and research groups may work in a local network mode and in distributed IT environments, with the ability to access resources on the servers of any participating organization. The paper discusses some specific cases of contextual knowledge explication using the e-library services and focuses on the possibilities of new types of contextual knowledge. The experimental research base consists of scientific texts about 'e-government' and 'computer games'. 
An analysis of trends in the subject-themed texts made it possible to propose a content analysis methodology that combines full-text search with automatic construction of a 'terminogramma' and expert analysis of the selected contexts. A 'terminogramma' is laid out as a table that contains a column with a frequency-ranked list of words (nouns), as well as columns indicating the absolute frequency (number) and the relative frequency of occurrence of the word (in %% ppm). The analysis of 'e-government' materials showed that the state takes a dominant position in the processes of electronic interaction between the authorities and society in modern Russia. The media credited the main role in these processes to the government, which provided public services through specialized portals. Factor analysis revealed two factors statistically describing the terms used: human interaction (the user) and the state (government, process organizer); interaction management (public officer, process performer) and technology (infrastructure). Isolation of these factors will lead to changes in the model of electronic interaction between government and society. In this study, the dominant social problems and the prevalence of different categories of subjects of computer gaming in scientific papers from 2005 to 2015 were identified. As a result, several types of contextual knowledge were clearly identified: micro context; macro context; dynamic context; thematic collection of queries (interactive contextual knowledge expanding the composition of e-library information resources); multimodal context (functional integration of iconographic and full-text resources through a hybrid quasi-semantic search algorithm). Further studies can be pursued both in terms of expanding the resource base on which they are conducted and in terms of developing appropriate tools.
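
The 'terminogramma' table described in this abstract (a frequency-ranked word list with absolute and relative frequencies) can be sketched in a few lines. This is not the 'Humanitariana' implementation: the sketch counts every word rather than only nouns, because part-of-speech filtering would need an external tagger, and it reports relative frequency in parts per million as an assumed unit. The function name and sample text are hypothetical.

```python
import re
from collections import Counter

def terminogramma(text, top=5):
    """Return rows of (word, absolute frequency, relative frequency in ppm),
    ranked by descending absolute frequency."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    return [(word, count, 1_000_000 * count / total)
            for word, count in counts.most_common(top)]

# Hypothetical toy text, not from the study corpus.
rows = terminogramma("state portal state service portal state")
for word, absolute, ppm in rows:
    print(f"{word:10s} {absolute:3d} {ppm:12.1f}")
```

An expert would then inspect the contexts in which the top-ranked terms occur, which is the manual step the abstract pairs with the automatic table.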

Keywords: contextual knowledge, contextual search, e-library services, frequency-ranked query, paragraph-oriented query, technologies of the contextual knowledge extraction

Procedia PDF Downloads 324
3 Cycleloop Personal Rapid Transit: An Exploratory Study for Last Mile Connectivity in Urban Transport

Authors: Suresh Salla

Abstract:

In this paper, the author explores the most sustainable last-mile transport mode for addressing the present problems of traffic congestion, jams, pollution and travel stress. Developing energy-efficient, sustainable, integrated transport systems is a must to make our cities more livable. Emphasis on autonomous, connected, electric, shared systems for effective utilization of vehicles and public infrastructure is on the rise. Many surface mobility innovations, like PBS, ride hailing, ride sharing, etc., although workable, when analyzed holistically add to already congested roads, are difficult to use in hostile weather, cause pollution and impose commuter stress. Sustainability of transportation is evaluated with respect to public adoption, average speed, energy consumption, and pollution. Why does the public prefer certain modes over others? How does commute time play a role in mode selection or shift? What factors play a role in energy consumption and pollution? Based on the study, it is clear that the public prefers a transport mode that is exhaustive (i.e., less need for interchange, because the network is widespread), intensive (i.e., less waiting time, because vehicles are available at frequent intervals) and convenient, with the latest technologies. Average speed depends on stops, number of intersections, signals, clear route availability, etc. It is clear from physics that the higher the kerb weight of a vehicle, the higher its operational energy consumption. Higher kerb weight also demands heavier infrastructure. Pollution depends on the source of energy, the efficiency of the vehicle, and the average speed. A mode can be made exhaustive when the unit infrastructure cost is low and can be offered intensively when the vehicle cost is low. Reliable and seamless integrated mobility up to the last ¼ mile (Five Minute Walk, FMW) is a must to encourage sustainable public transportation. The study shows that the average speed and reliability of dedicated modes (like Metro, PRT, BRT, etc.) are high compared to road vehicles. 
Electric vehicles, and even more so battery-less or third-rail vehicles, reduce pollution. One potential mode is Cycleloop PRT, in which the commuter rides an e-cycle on a dedicated path: elevated, at grade or underground. An e-bike with a kerb weight per rider of 15 kg, 1/50th that of a car or 1/10th that of other PRT systems, makes it a sustainable mode. The Cycleloop tube will be light, sleek and scalable and can be erected modularly, either on modified street lamp-posts or suspended between two stations. Embarking and disembarking points, or offline stations, can be placed at intervals that suit the FMW to mass public transit. In terms of convenience, a guided e-bike can be made self-balancing, thus enabling driverless on-demand vehicles. An e-bike equipped with smart electronics and drive controls can respond intelligently to field sensors and move autonomously in response to a Central Controller. Smart switching allows travel from origin to destination without interchange of cycles. A DC-powered batteryless e-cycle with voluntary manual pedaling is sustainable and provides health benefits. Tandem e-bikes, smart switching and platoon operation algorithm options provide superior throughput for the Cycleloop. Thus Cycleloop PRT will be an exhaustive, intensive, convenient, reliable, speedy, sustainable, safe, pollution-free and healthy alternative mode for last-mile connectivity in cities.
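
The figures quoted in this abstract imply a few derived quantities that are easy to check. The arithmetic below uses only the numbers stated in the text (15 kg per rider, the 1/50 and 1/10 ratios, a quarter mile in five minutes); the variable names are invented for illustration.

```python
EBIKE_KERB_KG_PER_RIDER = 15.0  # stated in the abstract

# "1/50th of car" and "1/10th of other PRT systems" imply these per-rider weights.
car_kg_per_rider = EBIKE_KERB_KG_PER_RIDER * 50   # 750 kg
prt_kg_per_rider = EBIKE_KERB_KG_PER_RIDER * 10   # 150 kg

# A quarter mile covered in five minutes implies this walking speed.
METERS_PER_MILE = 1609.344
walk_speed_kmh = (0.25 * METERS_PER_MILE / (5 * 60)) * 3.6  # about 4.83 km/h
```

The implied walking speed of roughly 4.8 km/h is in the range usually assumed for pedestrians, so the "five-minute walk" and "last ¼ mile" figures are consistent with each other.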

Keywords: cycleloop PRT, five-minute walk, lean modular infrastructure, self-balanced intelligent e-cycle

Procedia PDF Downloads 108
2 Microfabrication and Non-Invasive Imaging of Porous Osteogenic Structures Using Laser-Assisted Technologies

Authors: Irina Alexandra Paun, Mona Mihailescu, Marian Zamfirescu, Catalin Romeo Luculescu, Adriana Maria Acasandrei, Cosmin Catalin Mustaciosu, Roxana Cristina Popescu, Maria Dinescu

Abstract:

A major concern in bone tissue engineering is to develop complex 3D architectures that mimic the cells' natural environment, facilitate cell growth in a defined manner and allow the flow transport of nutrients and metabolic waste. In particular, porous structures of controlled pore size and positioning are indispensable for growing human-like bone structures. Another concern is to monitor both the structures and the seeded cells with high spatial resolution and without interfering with the cells' natural environment. The present approach relies on laser-based technologies employed for fabricating porous biomimetic structures that support the growth of osteoblast-like cells and for their non-invasive 3D imaging. Specifically, the porous structures were built by two-photon polymerization direct writing (2PP_DW) of the commercially available photoresist IP-L780, using the Photonic Professional 3D lithography system. The structures consist of vertical tubes with micrometer-sized heights and diameters, in a honeycomb-like spatial arrangement. These were fabricated by irradiating the IP-L780 photoresist with focused laser pulses with a wavelength centered at 780 nm, 120 fs pulse duration and 80 MHz repetition rate. The samples were precisely scanned in 3D by piezo stages. The coarse positioning was done by XY motorized stages. The scanning path was programmed through a writing language (GWL) script developed by Nanoscribe. Following laser irradiation, the unexposed regions of the photoresist were washed out by immersing the samples in Propylene Glycol Monomethyl Ether Acetate (PGMEA). The porous structures were seeded with osteoblast-like MG-63 cells and their osteogenic potential was tested in vitro. The cell-seeded structures were analyzed in 3D using the digital holographic microscopy (DHM) technique. DHM is a marker-free, high-spatial-resolution imaging tool, where the hologram acquisition is performed non-invasively, i.e., 
without interfering with the cells natural environment. Following hologram recording, a digital algorithm provided a 3D image of the sample, as well as information about its refractive index, which is correlated with the intracellular content. The axial resolution of the images went down to the nanoscale, while the temporal scales ranged from milliseconds up to hours. The hologram did not involve sample scanning and the whole image was available in one frame recorded going over 200μm field of view. The digital holograms processing provided 3D quantitative information on the porous structures and allowed a quantitative analysis of the cellular response in respect to the porous architectures. The cellular shape and dimensions were found to be influenced by the underlying micro relief. Furthermore, the intracellular content gave evidence on the beneficial role of the porous structures in promoting osteoblast differentiation. In all, the proposed laser-based protocol emerges as a promising tool for the fabrication and non-invasive imaging of porous constructs for bone tissue engineering. Acknowledgments: This work was supported by a grant of the Romanian Authority for Scientific Research and Innovation, CNCS-UEFISCDI, project PN-II-RU-TE-2014-4-2534 (contract 97 from 01/10/2015) and by UEFISCDI PN-II-PT-PCCA no. 6/2012. A part of this work was performed in the CETAL laser facility, supported by the National Program PN 16 47 - LAPLAS IV.

Keywords: biomimetic, holography, laser, osteoblast, two photon polymerization

Procedia PDF Downloads 245
1 Computational Fluid Dynamics Simulation of a Nanofluid-Based Annular Solar Collector with Different Metallic Nano-Particles

Authors: Sireetorn Kuharat, Anwar Beg

Abstract:

Motivation- Solar energy constitutes the most promising renewable energy source on earth. Nanofluids are a very successful family of engineered fluids, which contain well-dispersed nanoparticles suspended in a stable base fluid. The presence of metallic nanoparticles (e.g., gold, silver, copper, aluminum, etc.) significantly improves the thermo-physical properties of the host fluid and generally results in a considerable boost in the thermal conductivity, density, and viscosity of the nanofluid compared with the original base (host) fluid. This modification in fundamental thermal properties has profound implications for the convective heat transfer process in solar collectors. The potential for improving solar collector direct absorber efficiency is immense. To gain a deeper insight into the impact of different metallic nanoparticles on efficiency and temperature enhancement, in the present work we describe recent computational fluid dynamics simulations of an annular solar collector system. The present work studies several different metallic nano-particles and compares their performance. Methodologies- A numerical study of convective heat transfer in an annular pipe solar collector system is conducted. The inner tube contains pure water and the annular region contains nanofluid. Three-dimensional steady-state incompressible laminar flow comprising water- (and other) based nanofluid containing a variety of metallic nanoparticles (copper oxide, aluminum oxide, and titanium oxide nanoparticles) is examined. The Tiwari-Das model is deployed, in which the thermal conductivity, specific heat capacity and viscosity of the nanofluid suspensions are evaluated as functions of the solid nano-particle volume fraction. Radiative heat transfer is also incorporated using the ANSYS solar flux and Rosseland radiative models. The ANSYS FLUENT finite volume code (version 18.1) is employed to simulate the thermo-fluid characteristics via the SIMPLE algorithm. 
Mesh-independence tests are conducted. Validation of the simulations is also performed with a computational Harlow-Welch MAC (Marker and Cell) finite difference method and excellent correlation achieved. The influence of volume fraction on temperature, velocity, pressure contours is computed and visualized. Main findings- The best overall performance is achieved with copper oxide nanoparticles. Thermal enhancement is generally maximized when water is utilized as the base fluid, although in certain cases ethylene glycol also performs very efficiently. Increasing nanoparticle solid volume fraction elevates temperatures although the effects are less prominent in aluminum and titanium oxide nanofluids. Significant improvement in temperature distributions is achieved with copper oxide nanofluid and this is attributed to the superior thermal conductivity of copper compared to other metallic nano-particles studied. Important fluid dynamic characteristics are also visualized including circulation and temperature shoots near the upper region of the annulus. Radiative flux is observed to enhance temperatures significantly via energization of the nanofluid although again the best elevation in performance is attained consistently with copper oxide. Conclusions-The current study generalizes previous investigations by considering multiple metallic nano-particles and furthermore provides a good benchmark against which to calibrate experimental tests on a new solar collector configuration currently being designed at Salford University. Important insights into the thermal conductivity and viscosity with metallic nano-particles is also provided in detail. The analysis is also extendable to other metallic nano-particles including gold and zinc.
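
The Tiwari-Das model named in this abstract treats the nanofluid as a single phase whose properties depend on the solid volume fraction. The sketch below uses the correlations usually associated with that model (linear mixing for density and heat capacity, the Brinkman viscosity relation, and the Maxwell thermal conductivity relation); the abstract does not list its exact formulas, so this pairing is an assumption. The material values are rough placeholders for illustration, not data from the study.

```python
def tiwari_das(phi, fluid, solid):
    """Effective nanofluid properties for solid volume fraction phi (0 <= phi < 1).

    fluid and solid are dicts with density rho [kg/m^3], specific heat cp
    [J/(kg K)] and thermal conductivity k [W/(m K)]; fluid also needs the
    dynamic viscosity mu [Pa s].
    """
    rho = (1 - phi) * fluid["rho"] + phi * solid["rho"]
    rho_cp = ((1 - phi) * fluid["rho"] * fluid["cp"]
              + phi * solid["rho"] * solid["cp"])
    mu = fluid["mu"] / (1 - phi) ** 2.5              # Brinkman relation
    kf, ks = fluid["k"], solid["k"]
    k = kf * ((ks + 2 * kf - 2 * phi * (kf - ks))
              / (ks + 2 * kf + phi * (kf - ks)))     # Maxwell relation
    return {"rho": rho, "cp": rho_cp / rho, "mu": mu, "k": k}

# Placeholder property values (assumed, order-of-magnitude only).
water = {"rho": 997.0, "cp": 4180.0, "mu": 8.9e-4, "k": 0.61}
copper_oxide = {"rho": 6500.0, "cp": 540.0, "k": 18.0}

for phi in (0.0, 0.01, 0.05):
    props = tiwari_das(phi, water, copper_oxide)
    print(phi, {name: round(value, 5) for name, value in props.items()})
```

With these relations, increasing the volume fraction raises thermal conductivity, density and viscosity together, which matches the trade-off the abstract describes between thermal enhancement and flow resistance.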

Keywords: heat transfer, annular nanofluid solar collector, ANSYS FLUENT, metallic nanoparticles

Procedia PDF Downloads 113