Search results for: stochastic integrals

5 Regional Hydrological Extremes Frequency Analysis Based on Statistical and Hydrological Models

Abstract:

The hydrological extremes frequency analysis is the foundation for the hydraulic engineering design, flood protection, drought management and water resources management and planning to utilize the available water resource to meet the desired objectives of different organizations and sectors in a country. This spatial variation of the statistical characteristics of the extreme flood and drought events are key practice for regional flood and drought analysis and mitigation management. For different hydro-climate of the regions, where the data set is short, scarcity, poor quality and insufﬁcient, the regionalization methods are applied to transfer at-site data to a region. This study aims in regional high and low flow frequency analysis for Poland River Basins. Due to high frequent occurring of hydrological extremes in the region and rapid water resources development in this basin have caused serious concerns over the flood and drought magnitude and frequencies of the river in Poland. The magnitude and frequency result of high and low flows in the basin is needed for flood and drought planning, management and protection at present and future. Hydrological homogeneous high and low flow regions are formed by the cluster analysis of site characteristics, using the hierarchical and C- mean clustering and PCA method. Statistical tests for regional homogeneity are utilized, by Discordancy and Heterogeneity measure tests. In compliance with results of the tests, the region river basin has been divided into ten homogeneous regions. In this study, frequency analysis of high and low flows using AM for high flow and 7-day minimum low flow series is conducted using six statistical distributions. The use of L-moment and LL-moment method showed a homogeneous region over entire province with Generalized logistic (GLOG), Generalized extreme value (GEV), Pearson type III (P-III), Generalized Pareto (GPAR), Weibull (WEI) and Power (PR) distributions as the regional drought and flood frequency distributions. The 95% percentile and Flow duration curves of 1, 7, 10, 30 days have been plotted for 10 stations. However, the cluster analysis performed two regions in west and east of the province where L-moment and LL-moment method demonstrated the homogeneity of the regions and GLOG and Pearson Type III (PIII) distributions as regional frequency distributions for each region, respectively. The spatial variation and regional frequency distribution of flood and drought characteristics for 10 best catchment from the whole region was selected and beside the main variable (streamflow: high and low) we used variables which are more related to physiographic and drainage characteristics for identify and delineate homogeneous pools and to derive best regression models for ungauged sites. Those are mean annual rainfall, seasonal flow, average slope, NDVI, aspect, flow length, flow direction, maximum soil moisture, elevation, and drainage order. The regional high-flow or low-flow relationship among one streamflow characteristics with (AM or 7-day mean annual low flows) some basin characteristics is developed using Generalized Linear Mixed Model (GLMM) and Generalized Least Square (GLS) regression model, providing a simple and effective method for estimation of flood and drought of desired return periods for ungauged catchments.

Keywords: flood , drought, frequency, magnitude, regionalization, stochastic, ungauged, Poland

Procedia PDF Downloads 565

4 Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks

Authors: B. R. Campomanes-Alvarez, P. Quiros, B. Fernandez

Abstract:

Automatic Speech Recognition (ASR) is a machine-based process of decoding and transcribing oral speech. A typical ASR system receives acoustic input from a speaker or an audio file, analyzes it using algorithms, and produces an output in the form of a text. Some speech recognition systems use Hidden Markov Models (HMMs) to deal with the temporal variability of speech and Gaussian Mixture Models (GMMs) to determine how well each state of each HMM fits a short window of frames of coefficients that represents the acoustic input. Another way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition systems. Acoustic models for state-of-the-art ASR systems are usually training on massive amounts of data. However, audio files with their corresponding transcriptions can be difficult to obtain, especially in the Spanish language. Hence, in the case of these low-resource scenarios, building an ASR model is considered as a complex task due to the lack of labeled data, resulting in an under-trained system. Semi-supervised learning approaches arise as necessary tasks given the high cost of transcribing audio data. The main goal of this proposal is to develop a procedure based on acoustic semi-supervised learning for Spanish ASR systems by using DNNs. This semi-supervised learning approach consists of: (a) Training a seed ASR model with a DNN using a set of audios and their respective transcriptions. A DNN with a one-hidden-layer network was initialized; increasing the number of hidden layers in training, to a five. A refinement, which consisted of the weight matrix plus bias term and a Stochastic Gradient Descent (SGD) training were also performed. The objective function was the cross-entropy criterion. (b) Decoding/testing a set of unlabeled data with the obtained seed model. (c) Selecting a suitable subset of the validated data to retrain the seed model, thereby improving its performance on the target test set. To choose the most precise transcriptions, three confidence scores or metrics, regarding the lattice concept (based on the graph cost, the acoustic cost and a combination of both), was performed as selection technique. The performance of the ASR system will be calculated by means of the Word Error Rate (WER). The test dataset was renewed in order to extract the new transcriptions added to the training dataset. Some experiments were carried out in order to select the best ASR results. A comparison between a GMM-based model without retraining and the DNN proposed system was also made under the same conditions. Results showed that the semi-supervised ASR-model based on DNNs outperformed the GMM-model, in terms of WER, in all tested cases. The best result obtained an improvement of 6% relative WER. Hence, these promising results suggest that the proposed technique could be suitable for building ASR models in low-resource environments.

Keywords: automatic speech recognition, deep neural networks, machine learning, semi-supervised learning

Procedia PDF Downloads 316

3 Learning Curve Effect on Materials Procurement Schedule of Multiple Sister Ships

Authors: Vijaya Dixit Aasheesh Dixit

Abstract:

Shipbuilding industry operates in Engineer Procure Construct (EPC) context. Product mix of a shipyard comprises of various types of ships like bulk carriers, tankers, barges, coast guard vessels, sub-marines etc. Each order is unique based on the type of ship and customized requirements, which are engineered into the product right from design stage. Thus, to execute every new project, a shipyard needs to upgrade its production expertise. As a result, over the long run, holistic learning occurs across different types of projects which contributes to the knowledge base of the shipyard. Simultaneously, in the short term, during execution of a project comprising of multiple sister ships, repetition of similar tasks leads to learning at activity level. This research aims to capture above learnings of a shipyard and incorporate learning curve effect in project scheduling and materials procurement to improve project performance. Extant literature provides support for the existence of such learnings in an organization. In shipbuilding, there are sequences of similar activities which are expected to exhibit learning curve behavior. For example, the nearly identical structural sub-blocks which are successively fabricated, erected, and outfitted with piping and electrical systems. Learning curve representation can model not only a decrease in mean completion time of an activity, but also a decrease in uncertainty of activity duration. Sister ships have similar material requirements. The same supplier base supplies materials for all the sister ships within a project. On one hand, this provides an opportunity to reduce transportation cost by batching the order quantities of multiple ships. On the other hand, it increases the inventory holding cost at shipyard and the risk of obsolescence. Further, due to learning curve effect the production scheduled of each consequent ship gets compressed. Thus, the material requirement schedule of every next ship differs from its previous ship. As more and more ships get constructed, compressed production schedules increase the possibility of batching the orders of sister ships. This work aims at integrating materials management with project scheduling of long duration projects for manufacturing of multiple sister ships. It incorporates the learning curve effect on progressively compressing material requirement schedules and addresses the above trade-off of transportation cost and inventory holding and shortage costs while satisfying budget constraints of various stages of the project. The activity durations and lead time of items are not crisp and are available in the form of probabilistic distribution. A Stochastic Mixed Integer Programming (SMIP) model is formulated which is solved using evolutionary algorithm. Its output provides ordering dates of items and degree of order batching for all types of items. Sensitivity analysis determines the threshold number of sister ships required in a project to leverage the advantage of learning curve effect in materials management decisions. This analysis will help materials managers to gain insights about the scenarios: when and to what degree is it beneficial to treat a multiple ship project as an integrated one by batching the order quantities and when and to what degree to practice distinctive procurement for individual ship.

Keywords: learning curve, materials management, shipbuilding, sister ships

Procedia PDF Downloads 471

2 Modeling Competition Between Subpopulations with Variable DNA Content in Resource-Limited Microenvironments

Authors: Parag Katira, Frederika Rentzeperis, Zuzanna Nowicka, Giada Fiandaca, Thomas Veith, Jack Farinhas, Noemi Andor

Abstract:

Resource limitations shape the outcome of competitions between genetically heterogeneous pre-malignant cells. One example of such heterogeneity is in the ploidy (DNA content) of pre-malignant cells. A whole-genome duplication (WGD) transforms a diploid cell into a tetraploid one and has been detected in 28-56% of human cancers. If a tetraploid subclone expands, it consistently does so early in tumor evolution, when cell density is still low, and competition for nutrients is comparatively weak – an observation confirmed for several tumor types. WGD+ cells need more resources to synthesize increasing amounts of DNA, RNA, and proteins. To quantify resource limitations and how they relate to ploidy, we performed a PAN cancer analysis of WGD, PET/CT, and MRI scans. Segmentation of >20 different organs from >900 PET/CT scans were performed with MOOSE. We observed a strong correlation between organ-wide population-average estimates of Oxygen and the average ploidy of cancers growing in the respective organ (Pearson R = 0.66; P= 0.001). In-vitro experiments using near-diploid and near-tetraploid lineages derived from a breast cancer cell line supported the hypothesis that DNA content influences Glucose- and Oxygen-dependent proliferation-, death- and migration rates. To model how subpopulations with variable DNA content compete in the resource-limited environment of the human brain, we developed a stochastic state-space model of the brain (S3MB). The model discretizes the brain into voxels, whereby the state of each voxel is defined by 8+ variables that are updated over time: stiffness, Oxygen, phosphate, glucose, vasculature, dead cells, migrating cells and proliferating cells of various DNA content, and treat conditions such as radiotherapy and chemotherapy. Well-established Fokker-Planck partial differential equations govern the distribution of resources and cells across voxels. We applied S3MB on sequencing and imaging data obtained from a primary GBM patient. We performed whole genome sequencing (WGS) of four surgical specimens collected during the 1ˢᵗ and 2ⁿᵈ surgeries of the GBM and used HATCHET to quantify its clonal composition and how it changes between the two surgeries. HATCHET identified two aneuploid subpopulations of ploidy 1.98 and 2.29, respectively. The low-ploidy clone was dominant at the time of the first surgery and became even more dominant upon recurrence. MRI images were available before and after each surgery and registered to MNI space. The S3MB domain was initiated from 4mm³ voxels of the MNI space. T1 post and T2 flair scan acquired after the 1ˢᵗ surgery informed tumor cell densities per voxel. Magnetic Resonance Elastography scans and PET/CT scans informed stiffness and Glucose access per voxel. We performed a parameter search to recapitulate the GBM’s tumor cell density and ploidy composition before the 2ⁿᵈ surgery. Results suggest that the high-ploidy subpopulation had a higher Glucose-dependent proliferation rate (0.70 vs. 0.49), but a lower Glucose-dependent death rate (0.47 vs. 1.42). These differences resulted in spatial differences in the distribution of the two subpopulations. Our results contribute to a better understanding of how genomics and microenvironments interact to shape cell fate decisions and could help pave the way to therapeutic strategies that mimic prognostically favorable environments.

Keywords: tumor evolution, intra-tumor heterogeneity, whole-genome doubling, mathematical modeling

Procedia PDF Downloads 43

1 Deep Learning Based on Image Decomposition for Restoration of Intrinsic Representation

Authors: Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Kensuke Nakamura, Dongeun Choi, Byung-Woo Hong

Abstract:

Artefacts are commonly encountered in the imaging process of clinical computed tomography (CT) where the artefact refers to any systematic discrepancy between the reconstructed observation and the true attenuation coefficient of the object. It is known that CT images are inherently more prone to artefacts due to its image formation process where a large number of independent detectors are involved, and they are assumed to yield consistent measurements. There are a number of different artefact types including noise, beam hardening, scatter, pseudo-enhancement, motion, helical, ring, and metal artefacts, which cause serious difficulties in reading images. Thus, it is desired to remove nuisance factors from the degraded image leaving the fundamental intrinsic information that can provide better interpretation of the anatomical and pathological characteristics. However, it is considered as a difficult task due to the high dimensionality and variability of data to be recovered, which naturally motivates the use of machine learning techniques. We propose an image restoration algorithm based on the deep neural network framework where the denoising auto-encoders are stacked building multiple layers. The denoising auto-encoder is a variant of a classical auto-encoder that takes an input data and maps it to a hidden representation through a deterministic mapping using a non-linear activation function. The latent representation is then mapped back into a reconstruction the size of which is the same as the size of the input data. The reconstruction error can be measured by the traditional squared error assuming the residual follows a normal distribution. In addition to the designed loss function, an effective regularization scheme using residual-driven dropout determined based on the gradient at each layer. The optimal weights are computed by the classical stochastic gradient descent algorithm combined with the back-propagation algorithm. In our algorithm, we initially decompose an input image into its intrinsic representation and the nuisance factors including artefacts based on the classical Total Variation problem that can be efficiently optimized by the convex optimization algorithm such as primal-dual method. The intrinsic forms of the input images are provided to the deep denosing auto-encoders with their original forms in the training phase. In the testing phase, a given image is first decomposed into the intrinsic form and then provided to the trained network to obtain its reconstruction. We apply our algorithm to the restoration of the corrupted CT images by the artefacts. It is shown that our algorithm improves the readability and enhances the anatomical and pathological properties of the object. The quantitative evaluation is performed in terms of the PSNR, and the qualitative evaluation provides significant improvement in reading images despite degrading artefacts. The experimental results indicate the potential of our algorithm as a prior solution to the image interpretation tasks in a variety of medical imaging applications. This work was supported by the MISP(Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by the IITP(Institute for Information and Communications Technology Promotion).

Keywords: auto-encoder neural network, CT image artefact, deep learning, intrinsic image representation, noise reduction, total variation

Procedia PDF Downloads 165