Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24318

Search results for: incomplete data

24288 A Multivariate 4/2 Stochastic Covariance Model: Properties and Applications to Portfolio Decisions

Authors: Yuyang Cheng, Marcos Escobar-Anel

Abstract:

This paper introduces a multivariate 4/2 stochastic covariance process generalizing the one-dimensional counterparts presented in Grasselli (2017). Our construction permits stochastic correlation not only among stocks but also among volatilities, also known as co-volatility movements, both driven by more convenient 4/2 stochastic structures. The parametrization is flexible enough to separate these types of correlation, permitting their individual study. Conditions for proper changes of measure and closed-form characteristic functions under risk-neutral and historical measures are provided, allowing for applications of the model to risk management and derivative pricing. We apply the model to an expected utility theory problem in incomplete markets. Our analysis leads to closed-form solutions for the optimal allocation and value function. Conditions are provided for well-defined solutions together with a verification theorem. Our numerical analysis highlights and separates the impact of key statistics on equity portfolio decisions, in particular, volatility, correlation, and co-volatility movements, with the latter being the least important in an incomplete market.

Keywords: stochastic covariance process, 4/2 stochastic volatility model, stochastic co-volatility movements, characteristic function, expected utility theory, verication theorem

Procedia PDF Downloads 124

24287 Energy Complementary in Colombia: Imputation of Dataset

Authors: Felipe Villegas-Velasquez, Harold Pantoja-Villota, Sergio Holguin-Cardona, Alejandro Osorio-Botero, Brayan Candamil-Arango

Abstract:

Colombian electricity comes mainly from hydric resources, affected by environmental variations such as the El Niño phenomenon. That is why incorporating other types of resources is necessary to provide electricity constantly. This research seeks to fill the wind speed and global solar irradiance dataset for two years with the highest amount of information. A further result is the characterization of the data by region that led to infer which errors occurred and offered the incomplete dataset.

Keywords: energy, wind speed, global solar irradiance, Colombia, imputation

Procedia PDF Downloads 113

24286 Gradient Boosted Trees on Spark Platform for Supervised Learning in Health Care Big Data

Authors: Gayathri Nagarajan, L. D. Dhinesh Babu

Abstract:

Health care is one of the prominent industries that generate voluminous data thereby finding the need of machine learning techniques with big data solutions for efficient processing and prediction. Missing data, incomplete data, real time streaming data, sensitive data, privacy, heterogeneity are few of the common challenges to be addressed for efficient processing and mining of health care data. In comparison with other applications, accuracy and fast processing are of higher importance for health care applications as they are related to the human life directly. Though there are many machine learning techniques and big data solutions used for efficient processing and prediction in health care data, different techniques and different frameworks are proved to be effective for different applications largely depending on the characteristics of the datasets. In this paper, we present a framework that uses ensemble machine learning technique gradient boosted trees for data classification in health care big data. The framework is built on Spark platform which is fast in comparison with other traditional frameworks. Unlike other works that focus on a single technique, our work presents a comparison of six different machine learning techniques along with gradient boosted trees on datasets of different characteristics. Five benchmark health care datasets are considered for experimentation, and the results of different machine learning techniques are discussed in comparison with gradient boosted trees. The metric chosen for comparison is misclassification error rate and the run time of the algorithms. The goal of this paper is to i) Compare the performance of gradient boosted trees with other machine learning techniques in Spark platform specifically for health care big data and ii) Discuss the results from the experiments conducted on datasets of different characteristics thereby drawing inference and conclusion. The experimental results show that the accuracy is largely dependent on the characteristics of the datasets for other machine learning techniques whereas gradient boosting trees yields reasonably stable results in terms of accuracy without largely depending on the dataset characteristics.

Keywords: big data analytics, ensemble machine learning, gradient boosted trees, Spark platform

Procedia PDF Downloads 216

24285 Influence of Processing Parameters on the Reliability of Sieving as a Particle Size Distribution Measurements

Authors: Eseldin Keleb

Abstract:

In the pharmaceutical industry particle size distribution is an important parameter for the characterization of pharmaceutical powders. The powder flowability, reactivity and compatibility, which have a decisive impact on the final product, are determined by particle size and size distribution. Therefore, the aim of this study was to evaluate the influence of processing parameters on the particle size distribution measurements. Different Size fractions of α-lactose monohydrate and 5% polyvinylpyrrolidone were prepared by wet granulation and were used for the preparation of samples. The influence of sieve load (50, 100, 150, 200, 250, 300, and 350 g), processing time (5, 10, and 15 min), sample size ratios (high percentage of small and large particles), type of disturbances (vibration and shaking) and process reproducibility have been investigated. Results obtained showed that a sieve load of 50 g produce the best separation, a further increase in sample weight resulted in incomplete separation even after the extension of the processing time for 15 min. Performing sieving using vibration was rapider and more efficient than shaking. Meanwhile between day reproducibility showed that particle size distribution measurements are reproducible. However, for samples containing 70% fines or 70% large particles, which processed at optimized parameters, the incomplete separation was always observed. These results indicated that sieving reliability is highly influenced by the particle size distribution of the sample and care must be taken for samples with particle size distribution skewness.

Keywords: sieving, reliability, particle size distribution, processing parameters

Procedia PDF Downloads 580

24284 Expand Rabies Post-Exposure Prophylaxis to Where It Is Needed the Most

Authors: Henry Wilde, Thiravat Hemachudha

Abstract:

Human rabies deaths are underreported worldwide at 55,000 annual cases; more than of dengue and Japanese encephalitis. Almost half are children. A recent study from the Philippines of nearly 2,000 rabies deaths revealed that none of had received incomplete or no post exposure prophylaxis. Coming from a canine rabies endemic country, this is not unique. There are two major barriers to reducing human rabies deaths: 1) the large number of unvaccinated dogs and 2) post-exposure prophylaxis (PEP) that is not available, incomplete, not affordable, or not within reach for bite victims travel means. Only the first barrier, inadequate vaccination of dogs, is now being seriously addressed. It is also often not done effectively or sustainably. Rabies PEP has evolved as a complex, prolonged process, usually delegated to centers in larger cities. It is virtually unavailable in villages or small communities where most dog bites occur, victims are poor and usually unable to travel a long distance multiple times to receive PEP. Reseacrh that led to better understanding of the pathophysiology of rabies and immune responses to potent vaccines and immunoglobulin have allowed shortening and making PEP more evidence based. This knowledge needs to be adopted and applied so that PEP can be rendered safely and affordably where needed the most: by village health care workers who have long performed more complex services after appropriate training. Recent research makes this an important and long neglected goal that is now within our means to implement.

Keywords: rabies, post-exposure prophylaxis, availability, immunoglobulin

Procedia PDF Downloads 238

24283 The Normal-Generalized Hyperbolic Secant Distribution: Properties and Applications

Authors: Hazem M. Al-Mofleh

Abstract:

In this paper, a new four-parameter univariate continuous distribution called the Normal-Generalized Hyperbolic Secant Distribution (NGHS) is defined and studied. Some general and structural distributional properties are investigated and discussed, including: central and non-central n-th moments and incomplete moments, quantile and generating functions, hazard function, Rényi and Shannon entropies, shapes: skewed right, skewed left, and symmetric, modality regions: unimodal and bimodal, maximum likelihood (MLE) estimators for the parameters. Finally, two real data sets are used to demonstrate empirically its flexibility and prove the strength of the new distribution.

Keywords: bimodality, estimation, hazard function, moments, Shannon’s entropy

Procedia PDF Downloads 310

24282 Marginalized Two-Part Joint Models for Generalized Gamma Family of Distributions

Authors: Mohadeseh Shojaei Shahrokhabadi, Ding-Geng (Din) Chen

Abstract:

Positive continuous outcomes with a substantial number of zero values and incomplete longitudinal follow-up are quite common in medical cost data. To jointly model semi-continuous longitudinal cost data and survival data and to provide marginalized covariate effect estimates, a marginalized two-part joint model (MTJM) has been developed for outcome variables with lognormal distributions. In this paper, we propose MTJM models for outcome variables from a generalized gamma (GG) family of distributions. The GG distribution constitutes a general family that includes approximately all of the most frequently used distributions like the Gamma, Exponential, Weibull, and Log Normal. In the proposed MTJM-GG model, the conditional mean from a conventional two-part model with a three-parameter GG distribution is parameterized to provide the marginal interpretation for regression coefficients. In addition, MTJM-gamma and MTJM-Weibull are developed as special cases of MTJM-GG. To illustrate the applicability of the MTJM-GG, we applied the model to a set of real electronic health record data recently collected in Iran, and we provided SAS code for application. The simulation results showed that when the outcome distribution is unknown or misspecified, which is usually the case in real data sets, the MTJM-GG consistently outperforms other models. The GG family of distribution facilitates estimating a model with improved fit over the MTJM-gamma, standard Weibull, or Log-Normal distributions.

Keywords: marginalized two-part model, zero-inflated, right-skewed, semi-continuous, generalized gamma

Procedia PDF Downloads 149

24281 Linear Frequency Modulation-Frequency Shift Keying Radar with Compressive Sensing

Authors: Ho Jeong Jin, Chang Won Seo, Choon Sik Cho, Bong Yong Choi, Kwang Kyun Na, Sang Rok Lee

Abstract:

In this paper, a radar signal processing technique using the LFM-FSK (Linear Frequency Modulation-Frequency Shift Keying) is proposed for reducing the false alarm rate based on the compressive sensing. The LFM-FSK method combines FMCW (Frequency Modulation Continuous Wave) signal with FSK (Frequency Shift Keying). This shows an advantage which can suppress the ghost phenomenon without the complicated CFAR (Constant False Alarm Rate) algorithm. Moreover, the parametric sparse algorithm applying the compressive sensing that restores signals efficiently with respect to the incomplete data samples is also integrated, leading to reducing the burden of ADC in the receiver of radars. 24 GHz FMCW signal is applied and tested in the real environment with FSK modulated data for verifying the proposed algorithm along with the compressive sensing.

Keywords: compressive sensing, LFM-FSK radar, radar signal processing, sparse algorithm

Procedia PDF Downloads 445

24280 Retail Strategy to Reduce Waste Keeping High Profit Utilizing Taylor's Law in Point-of-Sales Data

Authors: Gen Sakoda, Hideki Takayasu, Misako Takayasu

Abstract:

Waste reduction is a fundamental problem for sustainability. Methods for waste reduction with point-of-sales (POS) data are proposed, utilizing the knowledge of a recent econophysics study on a statistical property of POS data. Concretely, the non-stationary time series analysis method based on the Particle Filter is developed, which considers abnormal fluctuation scaling known as Taylor's law. This method is extended for handling incomplete sales data because of stock-outs by introducing maximum likelihood estimation for censored data. The way for optimal stock determination with pricing the cost of waste reduction is also proposed. This study focuses on the examination of the methods for large sales numbers where Taylor's law is obvious. Numerical analysis using aggregated POS data shows the effectiveness of the methods to reduce food waste maintaining a high profit for large sales numbers. Moreover, the way of pricing the cost of waste reduction reveals that a small profit loss realizes substantial waste reduction, especially in the case that the proportionality constant of Taylor’s law is small. Specifically, around 1% profit loss realizes half disposal at =0.12, which is the actual value of processed food items used in this research. The methods provide practical and effective solutions for waste reduction keeping a high profit, especially with large sales numbers.

Keywords: food waste reduction, particle filter, point-of-sales, sustainable development goals, Taylor's law, time series analysis

Procedia PDF Downloads 104

24279 Molecular Alterations Shed Light on Alteration of Methionine Metabolism in Gastric Intestinal Metaplesia; Insight for Treatment Approach

Authors: Nigatu Tadesse, Ying Liu, Juan Li, Hong Ming Liu

Abstract:

Gastric carcinogenesis is a lengthy process of histopathological transition from normal to atrophic gastritis (AG) to intestinal metaplasia (GIM), dysplasia toward gastric cancer (GC). The stage of GIM identified as precancerous lesions with resistance to H-pylori eradication and recurrence after endoscopic surgical resection therapies. GIM divided in to two morphologically distinct phenotypes such as complete GIM bearing intestinal type morphology whereas the incomplete type has colonic type morphology. The incomplete type GIM considered to be the greatest risk factor for the development of GC. Studies indicated the expression of the caudal type homeobox 2 (CDX2) gene is responsible for the development of complete GIM but its progressive downregulation from incomplete metaplasia toward advanced GC identified as the risk for IM progression and neoplastic transformation. The downregulation of CDX2 gene have promoted cell growth and proliferation in gastric and colon cancers and ascribed in chemo-treatment inefficacies. CDX2 downregulated through promoter region hypermethylation in which the methylation frequency positively correlated with the dietary history of the patients, suggesting the role of diet as methyl carbon donor sources such as methionine. However, the metabolism of exogenous methionine is yet unclear. Targeting exogenous methionine metabolism has become a promising approach to limits tumor cell growth, proliferation and progression and increase treatment outcome. This review article discusses molecular alterations that could shed light on the potential of exogenous methionine metabolisms, such as gut microbiota alteration as sources of methionine to host cells, metabolic pathway signaling via PI3K/AKt/mTORC1-c-MYC to rewire exogenous methionine and signature of increased gene methylation index, cell growth and proliferation in GIM, with insights to new treatment avenue via targeting methionine metabolism, and the need for future integrated studies on molecular alterations and metabolomics to uncover altered methionine metabolism and characterization of CDX2 methylation in gastric intestinal metaplasia for potential therapeutic exploitation.

Keywords: altered methionine metabolism, Intestinal metaplesia, CDX2 gene, gastric cancer

Procedia PDF Downloads 33

24278 Comparative Study of Estimators of Population Means in Two Phase Sampling in the Presence of Non-Response

Authors: Syed Ali Taqi, Muhammad Ismail

Abstract:

A comparative study of estimators of population means in two phase sampling in the presence of non-response when Unknown population means of the auxiliary variable(s) and incomplete information of study variable y as well as of auxiliary variable(s) is made. Three real data sets of University students, hospital and unemployment are used for comparison of all the available techniques in two phase sampling in the presence of non-response with the newly generalized ratio estimators.

Keywords: two-phase sampling, ratio estimator, product estimator, generalized estimators

Procedia PDF Downloads 203

24277 Monte Carlo Methods and Statistical Inference of Multitype Branching Processes

Authors: Ana Staneva, Vessela Stoimenova

Abstract:

A parametric estimation of the MBP with Power Series offspring distribution family is considered in this paper. The MLE for the parameters is obtained in the case when the observable data are incomplete and consist only with the generation sizes of the family tree of MBP. The parameter estimation is calculated by using the Monte Carlo EM algorithm. The estimation for the posterior distribution and for the offspring distribution parameters are calculated by using the Bayesian approach and the Gibbs sampler. The article proposes various examples with bivariate branching processes together with computational results, simulation and an implementation using R.

Keywords: Bayesian, branching processes, EM algorithm, Gibbs sampler, Monte Carlo methods, statistical estimation

Procedia PDF Downloads 391

24276 Cleaning of Scientific References in Large Patent Databases Using Rule-Based Scoring and Clustering

Authors: Emiel Caron

Abstract:

Patent databases contain patent related data, organized in a relational data model, and are used to produce various patent statistics. These databases store raw data about scientific references cited by patents. For example, Patstat holds references to tens of millions of scientific journal publications and conference proceedings. These references might be used to connect patent databases with bibliographic databases, e.g. to study to the relation between science, technology, and innovation in various domains. Problematic in such studies is the low data quality of the references, i.e. they are often ambiguous, unstructured, and incomplete. Moreover, a complete bibliographic reference is stored in only one attribute. Therefore, a computerized cleaning and disambiguation method for large patent databases is developed in this work. The method uses rule-based scoring and clustering. The rules are based on bibliographic metadata, retrieved from the raw data by regular expressions, and are transparent and adaptable. The rules in combination with string similarity measures are used to detect pairs of records that are potential duplicates. Due to the scoring, different rules can be combined, to join scientific references, i.e. the rules reinforce each other. The scores are based on expert knowledge and initial method evaluation. After the scoring, pairs of scientific references that are above a certain threshold, are clustered by means of single-linkage clustering algorithm to form connected components. The method is designed to disambiguate all the scientific references in the Patstat database. The performance evaluation of the clustering method, on a large golden set with highly cited papers, shows on average a 99% precision and a 95% recall. The method is therefore accurate but careful, i.e. it weighs precision over recall. Consequently, separate clusters of high precision are sometimes formed, when there is not enough evidence for connecting scientific references, e.g. in the case of missing year and journal information for a reference. The clusters produced by the method can be used to directly link the Patstat database with bibliographic databases as the Web of Science or Scopus.

Keywords: clustering, data cleaning, data disambiguation, data mining, patent analysis, scientometrics

Procedia PDF Downloads 169

24275 3D Model Completion Based on Similarity Search with Slim-Tree

Authors: Alexis Aldo Mendoza Villarroel, Ademir Clemente Villena Zevallos, Cristian Jose Lopez Del Alamo

Abstract:

With the advancement of technology it is now possible to scan entire objects and obtain their digital representation by using point clouds or polygon meshes. However, some objects may be broken or have missing parts; thus, several methods focused on this problem have been proposed based on Geometric Deep Learning, such as GCNN, ACNN, PointNet, among others. In this article an approach from a different paradigm is proposed, using metric data structures to index global descriptors in the spectral domain and allow the recovery of a set of similar models in polynomial time; to later use the Iterative Close Point algorithm and recover the parts of the incomplete model using the geometry and topology of the model with less Hausdorff distance.

Keywords: 3D reconstruction method, point cloud completion, shape completion, similarity search

Procedia PDF Downloads 95

24274 Interrelationship between Quadriceps' Activation and Inhibition as a Function of Knee-Joint Angle and Muscle Length: A Torque and Electro and Mechanomyographic Investigation

Authors: Ronald Croce, Timothy Quinn, John Miller

Abstract:

Incomplete activation, or activation failure, of motor units during maximal voluntary contractions is often referred to as muscle inhibition (MI), and is defined as the inability of the central nervous system to maximally drive a muscle during a voluntary contraction. The purpose of the present study was to assess the interrelationship amongst peak torque (PT), muscle inhibition (MI; incomplete activation of motor units), and voluntary muscle activation (VMA) of the quadriceps’ muscle group as a function of knee angle and muscle length during maximal voluntary isometric contractions (MVICs). Nine young adult males (mean + standard deviation: age: 21.58 + 1.30 years; height: 180.07 + 4.99 cm; weight: 89.07 + 7.55 kg) performed MVICs in random order with the knee at 15, 55, and 95° flexion. MI was assessed using the interpolated twitch technique and was estimated by the amount of additional knee extensor PT evoked by the superimposed twitch during MVICs. Voluntary muscle activation was estimated by root mean square amplitude electromyography (EMGrms) and mechanomyography (MMGrms) of agonist (vastus medialis [VM], vastus lateralis [VL], and rectus femoris [RF]) and antagonist (biceps femoris ([BF]) muscles during MVICs. Data were analyzed using separate repeated measures analysis of variance. Results revealed a strong dependency of quadriceps’ PT (p < 0.001), MI (p < 0.001) and MA (p < 0.01) on knee joint position: PT was smallest at the most shortened muscle position (15°) and greatest at mid-position (55°); MI and MA were smallest at the most shortened muscle position (15°) and greatest at the most lengthened position (95°), with the RF showing the greatest change in MA. It is hypothesized that the ability to more fully activate the quadriceps at short compared to longer muscle lengths (96% contracted at 15°; 91% at 55°; 90% at 95°) might partly compensate for the unfavorable force-length mechanics at the more extended position and consequent declines in VMA (decreases in EMGrms and MMGrms muscle amplitude during MVICs) and force production (PT = 111-Nm at 15°, 217-NM at 55°, 199-Nm at 95°). Biceps femoris EMG and MMG data showed no statistical differences (p = 0.11 and 0.12, respectively) at joint angles tested, although there were greater values at the extended position. Increased BF muscle amplitude at this position could be a mechanism by which anterior shear and tibial rotation induced by high quadriceps’ activity are countered. Measuring and understanding the degree to which one sees MI and VMA in the QF muscle has particular clinical relevance because different knee-joint disorders, such ligament injuries or osteoarthritis, increase levels of MI observed and markedly reduced the capability of full VMA.

Keywords: electromyography, interpolated twitch technique, mechanomyography, muscle activation, muscle inhibition

Procedia PDF Downloads 312

24273 Structural Invertibility and Optimal Sensor Node Placement for Error and Input Reconstruction in Dynamic Systems

Authors: Maik Kschischo, Dominik Kahl, Philipp Wendland, Andreas Weber

Abstract:

Understanding and modelling of real-world complex dynamic systems in biology, engineering and other fields is often made difficult by incomplete knowledge about the interactions between systems states and by unknown disturbances to the system. In fact, most real-world dynamic networks are open systems receiving unknown inputs from their environment. To understand a system and to estimate the state dynamics, these inputs need to be reconstructed from output measurements. Reconstructing the input of a dynamic system from its measured outputs is an ill-posed problem if only a limited number of states is directly measurable. A first requirement for solving this problem is the invertibility of the input-output map. In our work, we exploit the fact that invertibility of a dynamic system is a structural property, which depends only on the network topology. Therefore, it is possible to check for invertibility using a structural invertibility algorithm which counts the number of node disjoint paths linking inputs and outputs. The algorithm is efficient enough, even for large networks up to a million nodes. To understand structural features influencing the invertibility of a complex dynamic network, we analyze synthetic and real networks using the structural invertibility algorithm. We find that invertibility largely depends on the degree distribution and that dense random networks are easier to invert than sparse inhomogeneous networks. We show that real networks are often very difficult to invert unless the sensor nodes are carefully chosen. To overcome this problem, we present a sensor node placement algorithm to achieve invertibility with a minimum set of measured states. This greedy algorithm is very fast and also guaranteed to find an optimal sensor node-set if it exists. Our results provide a practical approach to experimental design for open, dynamic systems. Since invertibility is a necessary condition for unknown input observers and data assimilation filters to work, it can be used as a preprocessing step to check, whether these input reconstruction algorithms can be successful. If not, we can suggest additional measurements providing sufficient information for input reconstruction. Invertibility is also important for systems design and model building. Dynamic models are always incomplete, and synthetic systems act in an environment, where they receive inputs or even attack signals from their exterior. Being able to monitor these inputs is an important design requirement, which can be achieved by our algorithms for invertibility analysis and sensor node placement.

Keywords: data-driven dynamic systems, inversion of dynamic systems, observability, experimental design, sensor node placement

Procedia PDF Downloads 123

24272 Comparing the Apparent Error Rate of Gender Specifying from Human Skeletal Remains by Using Classification and Cluster Methods

Authors: Jularat Chumnaul

Abstract:

In forensic science, corpses from various homicides are different; there are both complete and incomplete, depending on causes of death or forms of homicide. For example, some corpses are cut into pieces, some are camouflaged by dumping into the river, some are buried, some are burned to destroy the evidence, and others. If the corpses are incomplete, it can lead to the difficulty of personally identifying because some tissues and bones are destroyed. To specify gender of the corpses from skeletal remains, the most precise method is DNA identification. However, this method is costly and takes longer so that other identification techniques are used instead. The first technique that is widely used is considering the features of bones. In general, an evidence from the corpses such as some pieces of bones, especially the skull and pelvis can be used to identify their gender. To use this technique, forensic scientists are required observation skills in order to classify the difference between male and female bones. Although this technique is uncomplicated, saving time and cost, and the forensic scientists can fairly accurately determine gender by using this technique (apparently an accuracy rate of 90% or more), the crucial disadvantage is there are only some positions of skeleton that can be used to specify gender such as supraorbital ridge, nuchal crest, temporal lobe, mandible, and chin. Therefore, the skeletal remains that will be used have to be complete. The other technique that is widely used for gender specifying in forensic science and archeology is skeletal measurements. The advantage of this method is it can be used in several positions in one piece of bones, and it can be used even if the bones are not complete. In this study, the classification and cluster analysis are applied to this technique, including the Kth Nearest Neighbor Classification, Classification Tree, Ward Linkage Cluster, K-mean Cluster, and Two Step Cluster. The data contains 507 particular individuals and 9 skeletal measurements (diameter measurements), and the performance of five methods are investigated by considering the apparent error rate (APER). The results from this study indicate that the Two Step Cluster and Kth Nearest Neighbor method seem to be suitable to specify gender from human skeletal remains because both yield small apparent error rate of 0.20% and 4.14%, respectively. On the other hand, the Classification Tree, Ward Linkage Cluster, and K-mean Cluster method are not appropriate since they yield large apparent error rate of 10.65%, 10.65%, and 16.37%, respectively. However, there are other ways to evaluate the performance of classification such as an estimate of the error rate using the holdout procedure or misclassification costs, and the difference methods can make the different conclusions.

Keywords: skeletal measurements, classification, cluster, apparent error rate

Procedia PDF Downloads 227

24271 Combined Analysis of Sudoku Square Designs with Same Treatments

Authors: A. Danbaba

Abstract:

Several experiments are conducted at different environments such as locations or periods (seasons) with identical treatments to each experiment purposely to study the interaction between the treatments and environments or between the treatments and periods (seasons). The commonly used designs of experiments for this purpose are randomized block design, Latin square design, balanced incomplete block design, Youden design, and one or more factor designs. The interest is to carry out a combined analysis of the data from these multi-environment experiments, instead of analyzing each experiment separately. This paper proposed combined analysis of experiments conducted via Sudoku square design of odd order with same experimental treatments.

Keywords: combined analysis, sudoku design, common treatment, multi-environment experiments

Procedia PDF Downloads 318

24270 Challenges of Design, Cost and Surveying in Dams

Authors: Ali Mohammadi

Abstract:

The construction of Embankment dams is considered one of the most challenging construction projects, for which several main reasons can be mentioned. Excavation and embankment must be done in a large area, and its design is based on preliminary studies, but at the time of construction, it is possible that excavation does not match with the stability or slope of the rock, or the design is incomplete, and corrections should be made in order to be able to carry out excavation and embankment. Also, the progress of the work depends on the main factors, the lack of each of which can slow down the construction of the dams, and lead to an increase in costs, and control of excavations and embankments and calculations of their volumes are done in this collection. In the following, we will investigate three Embankment dams in Iran that faced these challenges and how they overcame these challenges. KHODA AFARIN on the Aras River between the two countries of IRAN and AZARBAIJAN, SIAH BISHEH PUMPED STORAGE on CHALUS River and GOTVAND on KARUN River are among the most important dams built in Iran.

Keywords: section, data transfer, tunnel, free station

Procedia PDF Downloads 42

24269 The Data Quality Model for the IoT based Real-time Water Quality Monitoring Sensors

Authors: Rabbia Idrees, Ananda Maiti, Saurabh Garg, Muhammad Bilal Amin

Abstract:

IoT devices are the basic building blocks of IoT network that generate enormous volume of real-time and high-speed data to help organizations and companies to take intelligent decisions. To integrate this enormous data from multisource and transfer it to the appropriate client is the fundamental of IoT development. The handling of this huge quantity of devices along with the huge volume of data is very challenging. The IoT devices are battery-powered and resource-constrained and to provide energy efficient communication, these IoT devices go sleep or online/wakeup periodically and a-periodically depending on the traffic loads to reduce energy consumption. Sometime these devices get disconnected due to device battery depletion. If the node is not available in the network, then the IoT network provides incomplete, missing, and inaccurate data. Moreover, many IoT applications, like vehicle tracking and patient tracking require the IoT devices to be mobile. Due to this mobility, If the distance of the device from the sink node become greater than required, the connection is lost. Due to this disconnection other devices join the network for replacing the broken-down and left devices. This make IoT devices dynamic in nature which brings uncertainty and unreliability in the IoT network and hence produce bad quality of data. Due to this dynamic nature of IoT devices we do not know the actual reason of abnormal data. If data are of poor-quality decisions are likely to be unsound. It is highly important to process data and estimate data quality before bringing it to use in IoT applications. In the past many researchers tried to estimate data quality and provided several Machine Learning (ML), stochastic and statistical methods to perform analysis on stored data in the data processing layer, without focusing the challenges and issues arises from the dynamic nature of IoT devices and how it is impacting data quality. A comprehensive review on determining the impact of dynamic nature of IoT devices on data quality is done in this research and presented a data quality model that can deal with this challenge and produce good quality of data. This research presents the data quality model for the sensors monitoring water quality. DBSCAN clustering and weather sensors are used in this research to make data quality model for the sensors monitoring water quality. An extensive study has been done in this research on finding the relationship between the data of weather sensors and sensors monitoring water quality of the lakes and beaches. The detailed theoretical analysis has been presented in this research mentioning correlation between independent data streams of the two sets of sensors. With the help of the analysis and DBSCAN, a data quality model is prepared. This model encompasses five dimensions of data quality: outliers’ detection and removal, completeness, patterns of missing values and checks the accuracy of the data with the help of cluster’s position. At the end, the statistical analysis has been done on the clusters formed as the result of DBSCAN, and consistency is evaluated through Coefficient of Variation (CoV).

Keywords: clustering, data quality, DBSCAN, and Internet of things (IoT)

Procedia PDF Downloads 108

24268 Firm-Created Social Media Communication and Consumer Brand Perceptions

Authors: Rabail Khalid

Abstract:

Social media has changed the business communication strategies in the corporate world. Firms are using social media to reach their maximum stakeholders in minimum time at different social media forums. The current study examines the role of firm-created social media communication on consumer brand perceptions and their loyalty to brand. An online survey is conducted through social media forums including Facebook and Twitter to collect data regarding social media communication of a well-reputed clothing company’s brand in Pakistan. A link is sent to 900 customers of that company. Out of 900 questionnaires, 534 were received. So, the response rate is 59.33%. During data screening and entry, 13 questionnaires are rejected due to incomplete answer. Therefore, 521 questionnaires are completed in all respect and seem to be helpful for the study. So, the positive response rate is 57.89%. The empirical results report positive and significant influence of company-generated social media communication on brand trust, brand equity, and brand loyalty. The findings of this study provide important information to the marketing professionals and brand managers to understand consumer behavior through social media communication.

Keywords: firm-created social media communication, brand trust, brand equity, consumer behavior, brand loyalty

Procedia PDF Downloads 352

24267 Heat Transfer and Diffusion Modelling

Authors: R. Whalley

Abstract:

The heat transfer modelling for a diffusion process will be considered. Difficulties in computing the time-distance dynamics of the representation will be addressed. Incomplete and irrational Laplace function will be identified as the computational issue. Alternative approaches to the response evaluation process will be provided. An illustration application problem will be presented. Graphical results confirming the theoretical procedures employed will be provided.

Keywords: heat, transfer, diffusion, modelling, computation

Procedia PDF Downloads 522

24266 Exploring Time-Series Phosphoproteomic Datasets in the Context of Network Models

Authors: Sandeep Kaur, Jenny Vuong, Marcel Julliard, Sean O'Donoghue

Abstract:

Time-series data are useful for modelling as they can enable model-evaluation. However, when reconstructing models from phosphoproteomic data, often non-exact methods are utilised, as the knowledge regarding the network structure, such as, which kinases and phosphatases lead to the observed phosphorylation state, is incomplete. Thus, such reactions are often hypothesised, which gives rise to uncertainty. Here, we propose a framework, implemented via a web-based tool (as an extension to Minardo), which given time-series phosphoproteomic datasets, can generate κ models. The incompleteness and uncertainty in the generated model and reactions are clearly presented to the user via the visual method. Furthermore, we demonstrate, via a toy EGF signalling model, the use of algorithmic verification to verify κ models. Manually formulated requirements were evaluated with regards to the model, leading to the highlighting of the nodes causing unsatisfiability (i.e. error causing nodes). We aim to integrate such methods into our web-based tool and demonstrate how the identified erroneous nodes can be presented to the user via the visual method. Thus, in this research we present a framework, to enable a user to explore phosphorylation proteomic time-series data in the context of models. The observer can visualise which reactions in the model are highly uncertain, and which nodes cause incorrect simulation outputs. A tool such as this enables an end-user to determine the empirical analysis to perform, to reduce uncertainty in the presented model - thus enabling a better understanding of the underlying system.

Keywords: κ-models, model verification, time-series phosphoproteomic datasets, uncertainty and error visualisation

Procedia PDF Downloads 227

24265 Maximum Likelihood Estimation Methods on a Two-Parameter Rayleigh Distribution under Progressive Type-Ii Censoring

Authors: Daniel Fundi Murithi

Abstract:

Data from economic, social, clinical, and industrial studies are in some way incomplete or incorrect due to censoring. Such data may have adverse effects if used in the estimation problem. We propose the use of Maximum Likelihood Estimation (MLE) under a progressive type-II censoring scheme to remedy this problem. In particular, maximum likelihood estimates (MLEs) for the location (µ) and scale (λ) parameters of two Parameter Rayleigh distribution are realized under a progressive type-II censoring scheme using the Expectation-Maximization (EM) and the Newton-Raphson (NR) algorithms. These algorithms are used comparatively because they iteratively produce satisfactory results in the estimation problem. The progressively type-II censoring scheme is used because it allows the removal of test units before the termination of the experiment. Approximate asymptotic variances and confidence intervals for the location and scale parameters are derived/constructed. The efficiency of EM and the NR algorithms is compared given root mean squared error (RMSE), bias, and the coverage rate. The simulation study showed that in most sets of simulation cases, the estimates obtained using the Expectation-maximization algorithm had small biases, small variances, narrower/small confidence intervals width, and small root of mean squared error compared to those generated via the Newton-Raphson (NR) algorithm. Further, the analysis of a real-life data set (data from simple experimental trials) showed that the Expectation-Maximization (EM) algorithm performs better compared to Newton-Raphson (NR) algorithm in all simulation cases under the progressive type-II censoring scheme.

Keywords: expectation-maximization algorithm, maximum likelihood estimation, Newton-Raphson method, two-parameter Rayleigh distribution, progressive type-II censoring

Procedia PDF Downloads 129

24264 Evidence Theory Based Emergency Multi-Attribute Group Decision-Making: Application in Facility Location Problem

Authors: Bidzina Matsaberidze

Abstract:

It is known that, in emergency situations, multi-attribute group decision-making (MAGDM) models are characterized by insufficient objective data and a lack of time to respond to the task. Evidence theory is an effective tool for describing such incomplete information in decision-making models when the expert and his knowledge are involved in the estimations of the MAGDM parameters. We consider an emergency decision-making model, where expert assessments on humanitarian aid from distribution centers (HADC) are represented in q-rung ortho-pair fuzzy numbers, and the data structure is described within the data body theory. Based on focal probability construction and experts’ evaluations, an objective function-distribution centers’ selection ranking index is constructed. Our approach for solving the constructed bicriteria partitioning problem consists of two phases. In the first phase, based on the covering’s matrix, we generate a matrix, the columns of which allow us to find all possible partitionings of the HADCs with the service centers. Some constraints are also taken into consideration while generating the matrix. In the second phase, based on the matrix and using our exact algorithm, we find the partitionings -allocations of the HADCs to the centers- which correspond to the Pareto-optimal solutions. For an illustration of the obtained results, a numerical example is given for the facility location-selection problem.

Keywords: emergency MAGDM, q-rung orthopair fuzzy sets, evidence theory, HADC, facility location problem, multi-objective combinatorial optimization problem, Pareto-optimal solutions

Procedia PDF Downloads 63

24263 An Audit of Restaging Transurethral Resection of Bladder Tumor (Re-TURBT) Quality in a District General Hospital

Authors: Rizwan Iqbal

Abstract:

Introduction: Re-TURBT has been recommended by international guidelines for patients with non-muscle invasive bladder cancer (NMIBC) who are deemed high-risk. Indications for re-TURBTs remain controversial and studies show mixed outcomes. It should be performed when the initial TURBT specimen lacks detrusor muscle, has tumor stage pT1 or G3/high-grade, or where resection is deemed incomplete. This ensures complete resection of tumors that have a high risk of recurrence as well as accurately identifying any tumors which have been upstaged. The aim of this audit was to evaluate the quality of re-TURBTs in a district general hospital. Method: Data were retrospectively collected from 31 patients who had re-TURBTs between April 2021 and September 2022. Data included baseline demographics, time from initial to re-TURBT, quality of operation note, presence of residual tumor, complications, and administration of chemotherapy within 24 hours of the initial TURBT. Data collection remains ongoing at the time of writing. Results: The mean age was 76 years old and 71.0% of patients were male. 32.3% of patients had their re-TURBT within six weeks and 32.3% had intravesical chemotherapy administered within 24 hours of the initial TURBT. 74.2% of initial TURBTs had detrusor muscle present in the specimen. 48.4% of patients had residual disease following re-TURBT. Just one patient had their pathology upstaged at re-TURBT. The use of the TURBT proforma on the operation note was variable, with 51.6% and 38.7% of surgeons using the proforma after the initial and re-TURBT. Conclusion: Re-TURBT improves bladder cancer staging and is necessary in patients who are deemed high-risk in order to identify any upstaging or recurrence of the disease.

Keywords: urology, bladder cancer, turbt, cancer

Procedia PDF Downloads 41

24262 Modelling Hydrological Time Series Using Wakeby Distribution

Authors: Ilaria Lucrezia Amerise

Abstract:

The statistical modelling of precipitation data for a given portion of territory is fundamental for the monitoring of climatic conditions and for Hydrogeological Management Plans (HMP). This modelling is rendered particularly complex by the changes taking place in the frequency and intensity of precipitation, presumably to be attributed to the global climate change. This paper applies the Wakeby distribution (with 5 parameters) as a theoretical reference model. The number and the quality of the parameters indicate that this distribution may be the appropriate choice for the interpolations of the hydrological variables and, moreover, the Wakeby is particularly suitable for describing phenomena producing heavy tails. The proposed estimation methods for determining the value of the Wakeby parameters are the same as those used for density functions with heavy tails. The commonly used procedure is the classic method of moments weighed with probabilities (probability weighted moments, PWM) although this has often shown difficulty of convergence, or rather, convergence to a configuration of inappropriate parameters. In this paper, we analyze the problem of the likelihood estimation of a random variable expressed through its quantile function. The method of maximum likelihood, in this case, is more demanding than in the situations of more usual estimation. The reasons for this lie, in the sampling and asymptotic properties of the estimators of maximum likelihood which improve the estimates obtained with indications of their variability and, therefore, their accuracy and reliability. These features are highly appreciated in contexts where poor decisions, attributable to an inefficient or incomplete information base, can cause serious damages.

Keywords: generalized extreme values, likelihood estimation, precipitation data, Wakeby distribution

Procedia PDF Downloads 111

24261 Biodiversity of Platyhelminthes Parasites on Batoids (Elasmobranchii) Fishes from the Algerian Coasts: First Annotated Inventory

Authors: Fadila Tazerouti, Affaf Boukadoum, Kamilia Gharbi, Karima Benmeslem

Abstract:

Parasites are recognized as an important component of biodiversity because of their crucial role in providing valuable information on host populations and in the functioning and balance of natural ecosystems. Although the knowledge about these pathogen organisms' diversity has increased these last years, many species still need to be identified and more investigations should be performed. Batoid fishes represent a significant biological resource, especially among populations of the Mediterranean basin. However, the data on their parasitic fauna, particularly in Algeria, remains unknown and still incomplete. Therefore, the aim of this study is to survey and provide data on the biodiversity of Platyhelminthes parasites of Elasmobranches fishes from Algerian coasts. 3217 specimens of Batoids belonging to 4 families, Topedinidae, Rajdae, Dasyatidae and Myliobatidae, caught in several sites on the Algerian coasts, were examined for their parasites. 47 taxa, including 7 new for science and belonging to 2 classes Monogenea and Cestoda, have been identified. Monogeneans presented the highest richness with 24 taxa and 5 new species for science: 4 Amphibdelloides species and one Calicotyle species. Cestodes are represented by 23 taxa and 3 new species: 2 Acanthobothrium and 1 species Echinobothrium. This study allowed us to establish for the first time in Algeria an inventory of Platyhelminthes parasites of this group of Chondrichthyes fish, as well as an invaluable contribution to the knowledge about the parasitic fauna of Algerian and Mediterranean Elasmobranch fishes.

Keywords: parasitic platyhelminthes, biodiversity, elasmobranches, algerian coasts, inventory

Procedia PDF Downloads 45

24260 Effectiveness of Weather Index Insurance for Smallholders in Ethiopia

Authors: Federica Di Marcantonio, Antoine Leblois, Wolfgang Göbel, Hervè Kerdiles

Abstract:

Weather-related shocks can threaten the ability of farmers to maintain their agricultural output and food security levels. Informal coping mechanisms (i.e. migration or community risk sharing) have always played a significant role in mitigating the negative effects of weather-related shocks in Ethiopia, but they have been found to be an incomplete strategy, particularly as a response to covariate shocks. Particularly, as an alternative to the traditional risk pooling products, an innovative form of insurance known as Index-based Insurance has received a lot of attention from researchers and international organizations, leading to an increased number of pilot initiatives in many countries. Despite the potential benefit of the product in protecting the livelihoods of farmers and pastoralists against climate shocks, to date there has been an unexpectedly low uptake. Using information from current pilot projects on index-based insurance in Ethiopia, this paper discusses the determinants of uptake that have so far undermined the scaling-up of the products, by focusing in particular on weather data availability, price affordability and willingness to pay. We found that, aside from data constraint issues, high price elasticity and low willingness to pay represent impediments to the development of the market. These results, bring us to rethink the role of index insurance as products for enhancing smallholders’ response to covariate shocks, and particularly for improving their food security.

Keywords: index-based insurance, willingness to pay, satellite information, Ethiopia

Procedia PDF Downloads 376

24259 Improved Safety Science: Utilizing a Design Hierarchy

Authors: Ulrica Pettersson

Abstract:

Collection of information on incidents is regularly done through pre-printed incident report forms. These tend to be incomplete and frequently lack essential information. ne consequence is that reports with inadequate information, that do not fulfil analysts’ requirements, are transferred into the analysis process. To improve an incident reporting form, theory in design science, witness psychology and interview and questionnaire research has been used. Previously three experiments have been conducted to evaluate the form and shown significant improved results. The form has proved to capture knowledge, regardless of the incidents’ character or context. The aim in this paper is to describe how design science, in more detail a design hierarchy can be used to construct a collection form for improvements in safety science.

Keywords: data collection, design science, incident reports, safety science

Procedia PDF Downloads 182