Search results for: Bayesian analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27141

Search results for: Bayesian analysis

26931 New Segmentation of Piecewise Moving-Average Model by Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

This paper addresses the problem of the signal segmentation within a Bayesian framework by using reversible jump MCMC algorithm. The signal is modelled by piecewise constant Moving-Average (MA) model where the numbers of segments, the position of change-point, the order and the coefficient of the MA model for each segment are unknown. The reversible jump MCMC algorithm is then used to generate samples distributed according to the joint posterior distribution of the unknown parameters. These samples allow calculating some interesting features of the posterior distribution. The performance of the methodology is illustrated via several simulation results.

Keywords: piecewise, moving-average model, reversible jump MCMC, signal segmentation

Procedia PDF Downloads 202
26930 Winter – Not Spring - Climate Drives Annual Adult Survival in Common Passerines: A Country-Wide, Multi-Species Modeling Exercise

Authors: Manon Ghislain, Timothée Bonnet, Olivier Gimenez, Olivier Dehorter, Pierre-Yves Henry

Abstract:

Climatic fluctuations affect the demography of animal populations, generating changes in population size, phenology, distribution and community assemblages. However, very few studies have identified the underlying demographic processes. For short-lived species, like common passerine birds, are these changes generated by changes in adult survival or in fecundity and recruitment? This study tests for an effect of annual climatic conditions (spring and winter) on annual, local adult survival at very large spatial (a country, 252 sites), temporal (25 years) and biological (25 species) scales. The Constant Effort Site ringing has allowed the collection of capture - mark - recapture data for 100 000 adult individuals since 1989, over metropolitan France, thus documenting annual, local survival rates of the most common passerine birds. We specifically developed a set of multi-year, multi-species, multi-site Bayesian models describing variations in local survival and recapture probabilities. This method allows for a statistically powerful hierarchical assessment (global versus species-specific) of the effects of climate variables on survival. A major part of between-year variations in survival rate was common to all species (74% of between-year variance), whereas only 26% of temporal variation was species-specific. Although changing spring climate is commonly invoked as a cause of population size fluctuations, spring climatic anomalies (mean precipitation or temperature for March-August) do not impact adult survival: only 1% of between-year variation of species survival is explained by spring climatic anomalies. However, for sedentary birds, winter climatic anomalies (North Atlantic Oscillation) had a significant, quadratic effect on adult survival, birds surviving less during intermediate years than during more extreme years. For migratory birds, we do not detect an effect of winter climatic anomalies (Sahel Rainfall). We will analyze the life history traits (migration, habitat, thermal range) that could explain a different sensitivity of species to winter climate anomalies. Overall, we conclude that changes in population sizes for passerine birds are unlikely to be the consequences of climate-driven mortality (or emigration) in spring but could be induced by other demographic parameters, like fecundity.

Keywords: Bayesian approach, capture-recapture, climate anomaly, constant effort sites scheme, passerine, seasons, survival

Procedia PDF Downloads 271
26929 Multi-Criteria Evolutionary Algorithm to Develop Efficient Schedules for Complex Maintenance Problems

Authors: Sven Tackenberg, Sönke Duckwitz, Andreas Petz, Christopher M. Schlick

Abstract:

This paper introduces an extension to the well-established Resource-Constrained Project Scheduling Problem (RCPSP) to apply it to complex maintenance problems. The problem is to assign technicians to a team which has to process several tasks with multi-level skill requirements during a work shift. Here, several alternative activities for a task allow both, the temporal shift of activities or the reallocation of technicians and tools. As a result, switches from one valid work process variant to another can be considered and may be selected by the developed evolutionary algorithm based on the present skill level of technicians or the available tools. An additional complication of the observed scheduling problem is that the locations of the construction sites are only temporarily accessible during a day. Due to intensive rail traffic, the available time slots for maintenance and repair works are extremely short and are often distributed throughout the day. To identify efficient working periods, a first concept of a Bayesian network is introduced and is integrated into the extended RCPSP with pre-emptive and non-pre-emptive tasks. Thereby, the Bayesian network is used to calculate the probability of a maintenance task to be processed during a specific period of the shift. Focusing on the domain of maintenance of the railway infrastructure in metropolitan areas as the most unproductive implementation process at construction site, the paper illustrates how the extended RCPSP can be applied for maintenance planning support. A multi-criteria evolutionary algorithm with a problem representation is introduced which is capable of revising technician-task allocations, whereas the duration of the task may be stochastic. The approach uses a novel activity list representation to ensure easily describable and modifiable elements which can be converted into detailed shift schedules. Thereby, the main objective is to develop a shift plan which maximizes the utilization of each technician due to a minimization of the waiting times caused by rail traffic. The results of the already implemented core algorithm illustrate a fast convergence towards an optimal team composition for a shift, an efficient sequence of tasks and a high probability of the subsequent implementation due to the stochastic durations of the tasks. In the paper, the algorithm for the extended RCPSP is analyzed in experimental evaluation using real-world example problems with various size, resource complexity, tightness and so forth.

Keywords: maintenance management, scheduling, resource constrained project scheduling problem, genetic algorithms

Procedia PDF Downloads 209
26928 Currency Exchange Rate Forecasts Using Quantile Regression

Authors: Yuzhi Cai

Abstract:

In this paper, we discuss a Bayesian approach to quantile autoregressive (QAR) time series model estimation and forecasting. Together with a combining forecasts technique, we then predict USD to GBP currency exchange rates. Combined forecasts contain all the information captured by the fitted QAR models at different quantile levels and are therefore better than those obtained from individual models. Our results show that an unequally weighted combining method performs better than other forecasting methodology. We found that a median AR model can perform well in point forecasting when the predictive density functions are symmetric. However, in practice, using the median AR model alone may involve the loss of information about the data captured by other QAR models. We recommend that combined forecasts should be used whenever possible.

Keywords: combining forecasts, MCMC, predictive density functions, quantile forecasting, quantile modelling

Procedia PDF Downloads 231
26927 Non-Linear Causality Inference Using BAMLSS and Bi-CAM in Finance

Authors: Flora Babongo, Valerie Chavez

Abstract:

Inferring causality from observational data is one of the fundamental subjects, especially in quantitative finance. So far most of the papers analyze additive noise models with either linearity, nonlinearity or Gaussian noise. We fill in the gap by providing a nonlinear and non-gaussian causal multiplicative noise model that aims to distinguish the cause from the effect using a two steps method based on Bayesian additive models for location, scale and shape (BAMLSS) and on causal additive models (CAM). We have tested our method on simulated and real data and we reached an accuracy of 0.86 on average. As real data, we considered the causality between financial indices such as S&P 500, Nasdaq, CAC 40 and Nikkei, and companies' log-returns. Our results can be useful in inferring causality when the data is heteroskedastic or non-injective.

Keywords: causal inference, DAGs, BAMLSS, financial index

Procedia PDF Downloads 128
26926 Investigating the Behavior of Individual Business Taxpayers: Behavioral Economics Approach

Authors: Yeganeh Mousavi Jahromi, Sahar Dehghan

Abstract:

In Direct Tax Act, penalties and incentives are two strategies for realization of the expected tax revenues. In this study, the interaction between individual businesses' taxpayers' behaviors and National Tax Administration is investigated by using prospect theory which is based on behavioral economics approach. For this purpose, the structure of the tax compliance of the mentioned taxpayers is evaluated via the changes in penalty and incentive rates. In this way, a special questionnaire regarding the items of individual businesses sector of Direct Tax Act was designed for tax compliance evaluation, and the results were obtained using Bayesian Hierarchical method. The results indicate that the investigated individual business taxpayers, at all income levels, were more sensitive toward incentive rates so that this result can be useful for tax policymakers.

Keywords: behavioral economics, prospect theory, tax compliance, penalties, incentives

Procedia PDF Downloads 41
26925 Choosing between the Regression Correlation, the Rank Correlation, and the Correlation Curve

Authors: Roger L. Goodwin

Abstract:

This paper presents a rank correlation curve. The traditional correlation coefficient is valid for both continuous variables and for integer variables using rank statistics. Since the correlation coefficient has already been established in rank statistics by Spearman, such a calculation can be extended to the correlation curve. This paper presents two survey questions. The survey collected non-continuous variables. We will show weak to moderate correlation. Obviously, one question has a negative effect on the other. A review of the qualitative literature can answer which question and why. The rank correlation curve shows which collection of responses has a positive slope and which collection of responses has a negative slope. Such information is unavailable from the flat, "first-glance" correlation statistics.

Keywords: Bayesian estimation, regression model, rank statistics, correlation, correlation curve

Procedia PDF Downloads 435
26924 Analyzing the Impact of Migration on HIV and AIDS Incidence Cases in Malaysia

Authors: Ofosuhene O. Apenteng, Noor Azina Ismail

Abstract:

The human immunodeficiency virus (HIV) that causes acquired immune deficiency syndrome (AIDS) remains a global cause of morbidity and mortality. It has caused panic since its emergence. Relationships between migration and HIV/AIDS have become complex. In the absence of prospectively designed studies, dynamic mathematical models that take into account the migration movement which will give very useful information. We have explored the utility of mathematical models in understanding transmission dynamics of HIV and AIDS and in assessing the magnitude of how migration has impact on the disease. The model was calibrated to HIV and AIDS incidence data from Malaysia Ministry of Health from the period of 1986 to 2011 using Bayesian analysis with combination of Markov chain Monte Carlo method (MCMC) approach to estimate the model parameters. From the estimated parameters, the estimated basic reproduction number was 22.5812. The rate at which the susceptible individual moved to HIV compartment has the highest sensitivity value which is more significant as compared to the remaining parameters. Thus, the disease becomes unstable. This is a big concern and not good indicator from the public health point of view since the aim is to stabilize the epidemic at the disease-free equilibrium. However, these results suggest that the government as a policy maker should make further efforts to curb illegal activities performed by migrants. It is shown that our models reflect considerably the dynamic behavior of the HIV/AIDS epidemic in Malaysia and eventually could be used strategically for other countries.

Keywords: epidemic model, reproduction number, HIV, MCMC, parameter estimation

Procedia PDF Downloads 345
26923 Prediction of PM₂.₅ Concentration in Ulaanbaatar with Deep Learning Models

Authors: Suriya

Abstract:

Rapid socio-economic development and urbanization have led to an increasingly serious air pollution problem in Ulaanbaatar (UB), the capital of Mongolia. PM₂.₅ pollution has become the most pressing aspect of UB air pollution. Therefore, monitoring and predicting PM₂.₅ concentration in UB is of great significance for the health of the local people and environmental management. As of yet, very few studies have used models to predict PM₂.₅ concentrations in UB. Using data from 0:00 on June 1, 2018, to 23:00 on April 30, 2020, we proposed two deep learning models based on Bayesian-optimized LSTM (Bayes-LSTM) and CNN-LSTM. We utilized hourly observed data, including Himawari8 (H8) aerosol optical depth (AOD), meteorology, and PM₂.₅ concentration, as input for the prediction of PM₂.₅ concentrations. The correlation strengths between meteorology, AOD, and PM₂.₅ were analyzed using the gray correlation analysis method; the comparison of the performance improvement of the model by using the AOD input value was tested, and the performance of these models was evaluated using mean absolute error (MAE) and root mean square error (RMSE). The prediction accuracies of Bayes-LSTM and CNN-LSTM deep learning models were both improved when AOD was included as an input parameter. Improvement of the prediction accuracy of the CNN-LSTM model was particularly enhanced in the non-heating season; in the heating season, the prediction accuracy of the Bayes-LSTM model slightly improved, while the prediction accuracy of the CNN-LSTM model slightly decreased. We propose two novel deep learning models for PM₂.₅ concentration prediction in UB, Bayes-LSTM, and CNN-LSTM deep learning models. Pioneering the use of AOD data from H8 and demonstrating the inclusion of AOD input data improves the performance of our two proposed deep learning models.

Keywords: deep learning, AOD, PM2.5, prediction, Ulaanbaatar

Procedia PDF Downloads 22
26922 Prediction of Distillation Curve and Reid Vapor Pressure of Dual-Alcohol Gasoline Blends Using Artificial Neural Network for the Determination of Fuel Performance

Authors: Leonard D. Agana, Wendell Ace Dela Cruz, Arjan C. Lingaya, Bonifacio T. Doma Jr.

Abstract:

The purpose of this paper is to study the predict the fuel performance parameters, which include drivability index (DI), vapor lock index (VLI), and vapor lock potential using distillation curve and Reid vapor pressure (RVP) of dual alcohol-gasoline fuel blends. Distillation curve and Reid vapor pressure were predicted using artificial neural networks (ANN) with macroscopic properties such as boiling points, RVP, and molecular weights as the input layers. The ANN consists of 5 hidden layers and was trained using Bayesian regularization. The training mean square error (MSE) and R-value for the ANN of RVP are 91.4113 and 0.9151, respectively, while the training MSE and R-value for the distillation curve are 33.4867 and 0.9927. Fuel performance analysis of the dual alcohol–gasoline blends indicated that highly volatile gasoline blended with dual alcohols results in non-compliant fuel blends with D4814 standard. Mixtures of low-volatile gasoline and 10% methanol or 10% ethanol can still be blended with up to 10% C3 and C4 alcohols. Intermediate volatile gasoline containing 10% methanol or 10% ethanol can still be blended with C3 and C4 alcohols that have low RVPs, such as 1-propanol, 1-butanol, 2-butanol, and i-butanol. Biography: Graduate School of Chemical, Biological, and Materials Engineering and Sciences, Mapua University, Muralla St., Intramuros, Manila, 1002, Philippines

Keywords: dual alcohol-gasoline blends, distillation curve, machine learning, reid vapor pressure

Procedia PDF Downloads 73
26921 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the classification of an unstructured format description for identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format indentification method employs file format classifier and associated configurations to support digital preservation experts with an estimation of required file format. Our goal is to make use of a format specification knowledge base aggregated from a different Web sources in order to select file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert, the file format for his institution. The proposed methods facilitate the selection of file format and the quality of a digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises most common terms that are characteristic for all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results including probabilities are presented in the evaluation section.

Keywords: data mining, digital libraries, digital preservation, file format

Procedia PDF Downloads 475
26920 Determining of the Performance of Data Mining Algorithm Determining the Influential Factors and Prediction of Ischemic Stroke: A Comparative Study in the Southeast of Iran

Authors: Y. Mehdipour, S. Ebrahimi, A. Jahanpour, F. Seyedzaei, B. Sabayan, A. Karimi, H. Amirifard

Abstract:

Ischemic stroke is one of the common reasons for disability and mortality. The fourth leading cause of death in the world and the third in some other sources. Only 1/3 of the patients with ischemic stroke fully recover, 1/3 of them end in permanent disability and 1/3 face death. Thus, the use of predictive models to predict stroke has a vital role in reducing the complications and costs related to this disease. Thus, the aim of this study was to specify the effective factors and predict ischemic stroke with the help of DM methods. The present study was a descriptive-analytic study. The population was 213 cases from among patients referring to Ali ibn Abi Talib (AS) Hospital in Zahedan. Data collection tool was a checklist with the validity and reliability confirmed. This study used DM algorithms of decision tree for modeling. Data analysis was performed using SPSS-19 and SPSS Modeler 14.2. The results of the comparison of algorithms showed that CHAID algorithm with 95.7% accuracy has the best performance. Moreover, based on the model created, factors such as anemia, diabetes mellitus, hyperlipidemia, transient ischemic attacks, coronary artery disease, and atherosclerosis are the most effective factors in stroke. Decision tree algorithms, especially CHAID algorithm, have acceptable precision and predictive ability to determine the factors affecting ischemic stroke. Thus, by creating predictive models through this algorithm, will play a significant role in decreasing the mortality and disability caused by ischemic stroke.

Keywords: data mining, ischemic stroke, decision tree, Bayesian network

Procedia PDF Downloads 151
26919 Reinforcement Learning the Born Rule from Photon Detection

Authors: Rodrigo S. Piera, Jailson Sales Ara´ujo, Gabriela B. Lemos, Matthew B. Weiss, John B. DeBrota, Gabriel H. Aguilar, Jacques L. Pienaar

Abstract:

The Born rule was historically viewed as an independent axiom of quantum mechanics until Gleason derived it in 1957 by assuming the Hilbert space structure of quantum measurements [1]. In subsequent decades there have been diverse proposals to derive the Born rule starting from even more basic assumptions [2]. In this work, we demonstrate that a simple reinforcement-learning algorithm, having no pre-programmed assumptions about quantum theory, will nevertheless converge to a behaviour pattern that accords with the Born rule, when tasked with predicting the output of a quantum optical implementation of a symmetric informationally-complete measurement (SIC). Our findings support a hypothesis due to QBism (the subjective Bayesian approach to quantum theory), which states that the Born rule can be thought of as a normative rule for making decisions in a quantum world [3].

Keywords: quantum Bayesianism, quantum theory, quantum information, quantum measurement

Procedia PDF Downloads 72
26918 Estimation and Forecasting with a Quantile AR Model for Financial Returns

Authors: Yuzhi Cai

Abstract:

This talk presents a Bayesian approach to quantile autoregressive (QAR) time series model estimation and forecasting. We establish that the joint posterior distribution of the model parameters and future values is well defined. The associated MCMC algorithm for parameter estimation and forecasting converges to the posterior distribution quickly. We also present a combining forecasts technique to produce more accurate out-of-sample forecasts by using a weighted sequence of fitted QAR models. A moving window method to check the quality of the estimated conditional quantiles is developed. We verify our methodology using simulation studies and then apply it to currency exchange rate data. An application of the method to the USD to GBP daily currency exchange rates will also be discussed. The results obtained show that an unequally weighted combining method performs better than other forecasting methodology.

Keywords: combining forecasts, MCMC, quantile modelling, quantile forecasting, predictive density functions

Procedia PDF Downloads 321
26917 A Time-Varying and Non-Stationary Convolution Spectral Mixture Kernel for Gaussian Process

Authors: Kai Chen, Shuguang Cui, Feng Yin

Abstract:

Gaussian process (GP) with spectral mixture (SM) kernel demonstrates flexible non-parametric Bayesian learning ability in modeling unknown function. In this work a novel time-varying and non-stationary convolution spectral mixture (TN-CSM) kernel with a significant enhancing of interpretability by using process convolution is introduced. A way decomposing the SM component into an auto-convolution of base SM component and parameterizing it to be input dependent is outlined. Smoothly, performing a convolution between two base SM component yields a novel structure of non-stationary SM component with much better generalized expression and interpretation. The TN-CSM perfectly allows compatibility with the stationary SM kernel in terms of kernel form and spectral base ignored and confused by previous non-stationary kernels. On synthetic and real-world datatsets, experiments show the time-varying characteristics of hyper-parameters in TN-CSM and compare the learning performance of TN-CSM with popular and representative non-stationary GP.

Keywords: Gaussian process, spectral mixture, non-stationary, convolution

Procedia PDF Downloads 168
26916 Gaussian Particle Flow Bernoulli Filter for Single Target Tracking

Authors: Hyeongbok Kim, Lingling Zhao, Xiaohong Su, Junjie Wang

Abstract:

The Bernoulli filter is a precise Bayesian filter for single target tracking based on the random finite set theory. The standard Bernoulli filter often underestimates the number of targets. This study proposes a Gaussian particle flow (GPF) Bernoulli filter employing particle flow to migrate particles from prior to posterior positions to improve the performance of the standard Bernoulli filter. By employing the particle flow filter, the computational speed of the Bernoulli filters is significantly improved. In addition, the GPF Bernoulli filter provides a more accurate estimation compared with that of the standard Bernoulli filter. Simulation results confirm the improved tracking performance and computational speed in two- and three-dimensional scenarios compared with other algorithms.

Keywords: Bernoulli filter, particle filter, particle flow filter, random finite sets, target tracking

Procedia PDF Downloads 63
26915 Genomic Characterisation of Equine Sarcoid-derived Bovine Papillomavirus Type 1 and 2 Using Nanopore-Based Sequencing

Authors: Lien Gysens, Bert Vanmechelen, Maarten Haspeslagh, Piet Maes, Ann Martens

Abstract:

Bovine papillomavirus (BPV) types 1 and 2 play a central role in the etiology of the most common neoplasm in horses, the equine sarcoid. The unknown mechanism behind the unique variety in a clinical presentation on the one hand and the host-dependent clinical outcome of BPV-1 infection, on the other hand, indicate the involvement of additional factors. Earlier studies have reported the potential functional significance of intratypic sequence variants, along with the existence of sarcoid-sourced BPV variants. Therefore, intratypic sequence variation seems to be an important emerging viral factor. This study aimed to give a broad insight in sarcoid-sourced BPV variation and explore its potential association with disease presentation. In order to do this, a nanopore sequencing approach was successfully optimized for screening a wide spectrum of clinical samples. Specimens of each tumour were initially screened for BPV-1/-2 by quantitative real-time PCR. A custom-designed primer set was used on BPV-positive samples to amplify the complete viral genome in two multiplex PCR reactions, resulting in a set of overlapping amplicons. For phylogenetic analysis, separate alignments were made of all available complete genome sequences for BPV-1/-2. The resulting alignments were used to infer Bayesian phylogenetic trees. We found substantial genetic variation among sarcoid-derived BPV-1, although this variation could not be linked to disease severity. Several of the BPV-1 genomes had multiple major deletions. Remarkably, the majority of the cluster within the region coding for late viral genes. Together with the extensiveness (up to 603 nucleotides) of the described deletions, this suggests an altered function of L1/L2 in disease pathogenesis. By generating a significant amount of complete-length BPV genomes, we succeeded in introducing next-generation sequencing into veterinary research focusing on the equine sarcoid, thus facilitating the first report of both nanopore-based sequencing of complete sarcoid-sourced BPV-1/-2 and the simultaneous nanopore sequencing of multiple complete genomes originating from a single clinical sample.

Keywords: Bovine papillomavirus, equine sarcoid, horse, nanopore sequencing, phylogenetic analysis

Procedia PDF Downloads 155
26914 Genetic Diversity Analysis in Ecological Populations of Persian Walnut

Authors: Masoud Sheidai, Fahimeh Koohdar, Hashem Sharifi

Abstract:

Juglans regia (L.) commonly known as Persian walnut of the genus Juglans L. (Juglandaceae) is one of the most important cultivated plant species due to its high-quality wood and edible nuts. The genetic diversity analysis is essential for conservation and management of tree species. Persian walnut is native from South-Eastern Europe to North-Western China through Tibet, Nepal, Northern India, Pakistan, and Iran. The species like Persian walnut, which has a wide range of geographical distribution, should harbor extensive genetic variability to adapt to environmental fluctuations they face. We aimed to study the population genetic structure of seven Persian walnut populations including three wild and four cultivated populations by using ISSR (Inter simple sequence repeats) and SRAP (Sequence related amplified polymorphism) molecular markers. We also aimed to compare the genetic variability revealed by ISSR neutral multilocus marker and rDNA ITS sequences. The studied populations differed in morphological features as the samples in each population were clustered together and were separate from the other populations. Three wild populations studied were placed close to each other. The mantel test after 5000 times permutation performed between geographical distance and morphological distance in Persian walnut populations produced significant correlation (r = 0.48, P = 0.002). Therefore, as the populations become farther apart, they become more divergent in morphological features. ISSR analysis produced 47 bands/ loci, while we obtained 15 SRAP bands. Gst and other differentiation statistics determined for these loci revealed that most of the ISSR and SRAP loci have very good discrimination power and can differentiate the studied populations. AMOVA performed for these loci produced a significant difference (< 0.05) supporting the above-said result. AMOVA produced significant genetic difference based on ISSR data among the studied populations (PhiPT = 0.52, P = 0.001). AMOVA revealed that 53% of the total variability is due to among population genetic difference, while 47% is due to within population genetic variability. The results showed that both multilocus molecular markers and ITS sequences can differentiate Persian walnut populations. The studied populations differed genetically and showed isolation by distance (IBD). ITS sequence based MP and Bayesian phylogenetic trees revealed that Iranian walnut cultivars form a distinct clade separated from the cultivars studied from elsewhere. Almost all clades obtained have high bootstrap value. The results indicated that a combination of multilpcus and sequencing molecular markers can be used in genetic differentiation of Persian walnut.

Keywords: genetic diversity, population, molecular markers, genetic difference

Procedia PDF Downloads 136
26913 Wireless Sensor Anomaly Detection Using Soft Computing

Authors: Mouhammd Alkasassbeh, Alaa Lasasmeh

Abstract:

We live in an era of rapid development as a result of significant scientific growth. Like other technologies, wireless sensor networks (WSNs) are playing one of the main roles. Based on WSNs, ZigBee adds many features to devices, such as minimum cost and power consumption, and increasing the range and connect ability of sensor nodes. ZigBee technology has come to be used in various fields, including science, engineering, and networks, and even in medicinal aspects of intelligence building. In this work, we generated two main datasets, the first being based on tree topology and the second on star topology. The datasets were evaluated by three machine learning (ML) algorithms: J48, meta.j48 and multilayer perceptron (MLP). Each topology was classified into normal and abnormal (attack) network traffic. The dataset used in our work contained simulated data from network simulation 2 (NS2). In each database, the Bayesian network meta.j48 classifier achieved the highest accuracy level among other classifiers, of 99.7% and 99.2% respectively.

Keywords: IDS, Machine learning, WSN, ZigBee technology

Procedia PDF Downloads 517
26912 A Review of Spatial Analysis as a Geographic Information Management Tool

Authors: Chidiebere C. Agoha, Armstong C. Awuzie, Chukwuebuka N. Onwubuariri, Joy O. Njoku

Abstract:

Spatial analysis is a field of study that utilizes geographic or spatial information to understand and analyze patterns, relationships, and trends in data. It is characterized by the use of geographic or spatial information, which allows for the analysis of data in the context of its location and surroundings. It is different from non-spatial or aspatial techniques, which do not consider the geographic context and may not provide as complete of an understanding of the data. Spatial analysis is applied in a variety of fields, which includes urban planning, environmental science, geosciences, epidemiology, marketing, to gain insights and make decisions about complex spatial problems. This review paper explores definitions of spatial analysis from various sources, including examples of its application and different analysis techniques such as Buffer analysis, interpolation, and Kernel density analysis (multi-distance spatial cluster analysis). It also contrasts spatial analysis with non-spatial analysis.

Keywords: aspatial technique, buffer analysis, epidemiology, interpolation

Procedia PDF Downloads 283
26911 The Maps of Meaning (MoM) Consciousness Theory

Authors: Scott Andersen

Abstract:

Perhaps simply and rather unadornedly, consciousness is having multiple goals for action and the continuously adjudication of such goals to implement action, referred to as the Maps of Meaning (MoM) Consciousness Theory. The MoM theory triangulates through three parallel corollaries, action (behavior), mechanism (morphology/pathophysiology), and goals (teleology). (1) An organism’s consciousness contains a fluid, nested goals. These goals are not intentionality, but intersectionality, embodiment meeting the world. i.e., Darwinian inclusive fitness or randomization, then survival of the fittest. These goals form via gradual descent under inclusive fitness, the goals being the abstraction of a ‘match’ between the evolutionary environment and organism. Human consciousness implements the brain efficiency hypothesis, genetics, epigenetics, and experience crystallize efficiencies, not necessitating best or objective but fitness, i.e., perceived efficiency based on one’s adaptive environment. These efficiencies are objectively arbitrary, but determine the operation and level of one’s consciousness, termed extreme thrownness. Since inclusive fitness drives efficiencies in physiologic mechanism, morphology and behavior (action) and originates one’s goals, embodiment is necessarily entangled to human consciousness as its the intersection of mechanism or action (both necessitating embodiment) occurring in the world that determines fitness. Perception is the operant process of consciousness and is the consciousness’ de facto goal adjudication process. Goal operationalization is fundamentally efficiency-based via one’s unique neuronal mapping as a byproduct of genetics, epigenetics, and experience. Perception involves information intake and information discrimination, equally underpinned by efficiencies of inclusive fitness via extreme thrownness. Perception isn’t a ‘frame rate,’ but Bayesian priors of efficiency based on one’s extreme thrownness. Consciousness and human consciousness is a modular (i.e., a scalar level of richness, which builds up like building blocks) and dimensionalized (i.e., cognitive abilities become possibilities as emergent phenomena at various modularities, like stratified factors in factor analysis). The meta dimensions of human consciousness seemingly include intelligence quotient, personality (five-factor model), richness of perception intake, and richness of perception discrimination, among other potentialities. Future consciousness research should utilize factor analysis to parse modularities and dimensions of human consciousness and animal models.

Keywords: consciousness, perception, prospection, embodiment

Procedia PDF Downloads 18
26910 Forecasting Model to Predict Dengue Incidence in Malaysia

Authors: W. H. Wan Zakiyatussariroh, A. A. Nasuhar, W. Y. Wan Fairos, Z. A. Nazatul Shahreen

Abstract:

Forecasting dengue incidence in a population can provide useful information to facilitate the planning of the public health intervention. Many studies on dengue cases in Malaysia were conducted but are limited in modeling the outbreak and forecasting incidence. This article attempts to propose the most appropriate time series model to explain the behavior of dengue incidence in Malaysia for the purpose of forecasting future dengue outbreaks. Several seasonal auto-regressive integrated moving average (SARIMA) models were developed to model Malaysia’s number of dengue incidence on weekly data collected from January 2001 to December 2011. SARIMA (2,1,1)(1,1,1)52 model was found to be the most suitable model for Malaysia’s dengue incidence with the least value of Akaike information criteria (AIC) and Bayesian information criteria (BIC) for in-sample fitting. The models further evaluate out-sample forecast accuracy using four different accuracy measures. The results indicate that SARIMA (2,1,1)(1,1,1)52 performed well for both in-sample fitting and out-sample evaluation.

Keywords: time series modeling, Box-Jenkins, SARIMA, forecasting

Procedia PDF Downloads 453
26909 Application of Subversion Analysis in the Search for the Causes of Cracking in a Marine Engine Injector Nozzle

Authors: Leszek Chybowski, Artur Bejger, Katarzyna Gawdzińska

Abstract:

Subversion analysis is a tool used in the TRIZ (Theory of Inventive Problem Solving) methodology. This article introduces the history and describes the process of subversion analysis, as well as function analysis and analysis of the resources, used at the design stage when generating possible undesirable situations. The article charts the course of subversion analysis when applied to a fuel injection nozzle of a marine engine. The work describes the fuel injector nozzle as a technological system and presents principles of analysis for the causes of a cracked tip of the nozzle body. The system is modelled with functional analysis. A search for potential causes of the damage is undertaken and a cause-and-effect analysis for various hypotheses concerning the damage is drawn up. The importance of particular hypotheses is evaluated and the most likely causes of damage identified.

Keywords: complex technical system, fuel injector, function analysis, importance analysis, resource analysis, sabotage analysis, subversion analysis, TRIZ (Theory of Inventive Problem Solving)

Procedia PDF Downloads 592
26908 Comparison of Different Artificial Intelligence-Based Protein Secondary Structure Prediction Methods

Authors: Jamerson Felipe Pereira Lima, Jeane Cecília Bezerra de Melo

Abstract:

The difficulty and cost related to obtaining of protein tertiary structure information through experimental methods, such as X-ray crystallography or NMR spectroscopy, helped raising the development of computational methods to do so. An approach used in these last is prediction of tridimensional structure based in the residue chain, however, this has been proved an NP-hard problem, due to the complexity of this process, explained by the Levinthal paradox. An alternative solution is the prediction of intermediary structures, such as the secondary structure of the protein. Artificial Intelligence methods, such as Bayesian statistics, artificial neural networks (ANN), support vector machines (SVM), among others, were used to predict protein secondary structure. Due to its good results, artificial neural networks have been used as a standard method to predict protein secondary structure. Recent published methods that use this technique, in general, achieved a Q3 accuracy between 75% and 83%, whereas the theoretical accuracy limit for protein prediction is 88%. Alternatively, to achieve better results, support vector machines prediction methods have been developed. The statistical evaluation of methods that use different AI techniques, such as ANNs and SVMs, for example, is not a trivial problem, since different training sets, validation techniques, as well as other variables can influence the behavior of a prediction method. In this study, we propose a prediction method based on artificial neural networks, which is then compared with a selected SVM method. The chosen SVM protein secondary structure prediction method is the one proposed by Huang in his work Extracting Physico chemical Features to Predict Protein Secondary Structure (2013). The developed ANN method has the same training and testing process that was used by Huang to validate his method, which comprises the use of the CB513 protein data set and three-fold cross-validation, so that the comparative analysis of the results can be made comparing directly the statistical results of each method.

Keywords: artificial neural networks, protein secondary structure, protein structure prediction, support vector machines

Procedia PDF Downloads 589
26907 Performance and Limitations of Likelihood Based Information Criteria and Leave-One-Out Cross-Validation Approximation Methods

Authors: M. A. C. S. Sampath Fernando, James M. Curran, Renate Meyer

Abstract:

Model assessment, in the Bayesian context, involves evaluation of the goodness-of-fit and the comparison of several alternative candidate models for predictive accuracy and improvements. In posterior predictive checks, the data simulated under the fitted model is compared with the actual data. Predictive model accuracy is estimated using information criteria such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the Deviance information criterion (DIC), and the Watanabe-Akaike information criterion (WAIC). The goal of an information criterion is to obtain an unbiased measure of out-of-sample prediction error. Since posterior checks use the data twice; once for model estimation and once for testing, a bias correction which penalises the model complexity is incorporated in these criteria. Cross-validation (CV) is another method used for examining out-of-sample prediction accuracy. Leave-one-out cross-validation (LOO-CV) is the most computationally expensive variant among the other CV methods, as it fits as many models as the number of observations. Importance sampling (IS), truncated importance sampling (TIS) and Pareto-smoothed importance sampling (PSIS) are generally used as approximations to the exact LOO-CV and utilise the existing MCMC results avoiding expensive computational issues. The reciprocals of the predictive densities calculated over posterior draws for each observation are treated as the raw importance weights. These are in turn used to calculate the approximate LOO-CV of the observation as a weighted average of posterior densities. In IS-LOO, the raw weights are directly used. In contrast, the larger weights are replaced by their modified truncated weights in calculating TIS-LOO and PSIS-LOO. Although, information criteria and LOO-CV are unable to reflect the goodness-of-fit in absolute sense, the differences can be used to measure the relative performance of the models of interest. However, the use of these measures is only valid under specific circumstances. This study has developed 11 models using normal, log-normal, gamma, and student’s t distributions to improve the PCR stutter prediction with forensic data. These models are comprised of four with profile-wide variances, four with locus specific variances, and three which are two-component mixture models. The mean stutter ratio in each model is modeled as a locus specific simple linear regression against a feature of the alleles under study known as the longest uninterrupted sequence (LUS). The use of AIC, BIC, DIC, and WAIC in model comparison has some practical limitations. Even though, IS-LOO, TIS-LOO, and PSIS-LOO are considered to be approximations of the exact LOO-CV, the study observed some drastic deviations in the results. However, there are some interesting relationships among the logarithms of pointwise predictive densities (lppd) calculated under WAIC and the LOO approximation methods. The estimated overall lppd is a relative measure that reflects the overall goodness-of-fit of the model. Parallel log-likelihood profiles for the models conditional on equal posterior variances in lppds were observed. This study illustrates the limitations of the information criteria in practical model comparison problems. In addition, the relationships among LOO-CV approximation methods and WAIC with their limitations are discussed. Finally, useful recommendations that may help in practical model comparisons with these methods are provided.

Keywords: cross-validation, importance sampling, information criteria, predictive accuracy

Procedia PDF Downloads 369
26906 Assessment of Potential Chemical Exposure to Betamethasone Valerate and Clobetasol Propionate in Pharmaceutical Manufacturing Laboratories

Authors: Nadeen Felemban, Hamsa Banjer, Rabaah Jaafari

Abstract:

One of the most common hazards in the pharmaceutical industry is the chemical hazard, which can cause harm or develop occupational health diseases/illnesses due to chronic exposures to hazardous substances. Therefore, a chemical agent management system is required, including hazard identification, risk assessment, controls for specific hazards and inspections, to keep your workplace healthy and safe. However, routine management monitoring is also required to verify the effectiveness of the control measures. Moreover, Betamethasone Valerate and Clobetasol Propionate are some of the APIs (Active Pharmaceutical Ingredients) with highly hazardous classification-Occupational Hazard Category (OHC 4), which requires a full containment (ECA-D) during handling to avoid chemical exposure. According to Safety Data Sheet, those chemicals are reproductive toxicants (reprotoxicant H360D), which may affect female workers’ health and cause fatal damage to an unborn child, or impair fertility. In this study, qualitative (chemical Risk assessment-qCRA) was conducted to assess the chemical exposure during handling of Betamethasone Valerate and Clobetasol Propionate in pharmaceutical laboratories. The outcomes of qCRA identified that there is a risk of potential chemical exposure (risk rating 8 Amber risk). Therefore, immediate actions were taken to ensure interim controls (according to the Hierarchy of controls) are in place and in use to minimize the risk of chemical exposure. No open handlings should be done out of the Steroid Glove Box Isolator (SGB) with the required Personal Protective Equipment (PPEs). The PPEs include coverall, nitrile hand gloves, safety shoes and powered air-purifying respirators (PAPR). Furthermore, a quantitative assessment (personal air sampling) was conducted to verify the effectiveness of the engineering controls (SGB Isolator) and to confirm if there is chemical exposure, as indicated earlier by qCRA. Three personal air samples were collected using an air sampling pump and filter (IOM2 filters, 25mm glass fiber media). The collected samples were analyzed by HPLC in the BV lab, and the measured concentrations were reported in (ug/m3) with reference to Occupation Exposure Limits, 8hr OELs (8hr TWA) for each analytic. The analytical results are needed in 8hr TWA (8hr Time-weighted Average) to be analyzed using Bayesian statistics (IHDataAnalyst). The results of the Bayesian Likelihood Graph indicate (category 0), which means Exposures are de "minimus," trivial, or non-existent Employees have little to no exposure. Also, these results indicate that the 3 samplings are representative samplings with very low variations (SD=0.0014). In conclusion, the engineering controls were effective in protecting the operators from such exposure. However, routine chemical monitoring is required every 3 years unless there is a change in the processor type of chemicals. Also, frequent management monitoring (daily, weekly, and monthly) is required to ensure the control measures are in place and in use. Furthermore, a Similar Exposure Group (SEG) was identified in this activity and included in the annual health surveillance for health monitoring.

Keywords: occupational health and safety, risk assessment, chemical exposure, hierarchy of control, reproductive

Procedia PDF Downloads 153
26905 A Game of Information in Defense/Attack Strategies: Case of Poisson Attacks

Authors: Asma Ben Yaghlane, Mohamed Naceur Azaiez

Abstract:

In this paper, we briefly introduce the concept of Poisson attacks in the case of defense/attack strategies where attacks are assumed to be continuous. We suggest a game model in which the attacker will combine both criteria of a sufficient confidence level of a successful attack and a reasonably small size of the estimation error in order to launch an attack. Here, estimation error arises from assessing the system failure upon attack using aggregate data at the system level. The corresponding error is referred to as aggregation error. On the other hand, the defender will attempt to deter attack by making one or both criteria inapplicable. The defender will build his/her strategy by both strengthening the targeted system and increasing the size of error. We will formulate the defender problem based on appropriate optimization models. The attacker will opt for a Bayesian updating in assessing the impact on the improvement made by the defender. Then, the attacker will evaluate the feasibility of the attack before making the decision of whether or not to launch it. We will provide illustrations to better explain the process.

Keywords: attacker, defender, game theory, information

Procedia PDF Downloads 437
26904 Modeling Food Popularity Dependencies Using Social Media Data

Authors: DEVASHISH KHULBE, MANU PATHAK

Abstract:

The rise in popularity of major social media platforms have enabled people to share photos and textual information about their daily life. One of the popular topics about which information is shared is food. Since a lot of media about food are attributed to particular locations and restaurants, information like spatio-temporal popularity of various cuisines can be analyzed. Tracking the popularity of food types and retail locations across space and time can also be useful for business owners and restaurant investors. In this work, we present an approach using off-the shelf machine learning techniques to identify trends and popularity of cuisine types in an area using geo-tagged data from social media, Google images and Yelp. After adjusting for time, we use the Kernel Density Estimation to get hot spots across the location and model the dependencies among food cuisines popularity using Bayesian Networks. We consider the Manhattan borough of New York City as the location for our analyses but the approach can be used for any area with social media data and information about retail businesses.

Keywords: Web Mining, Geographic Information Systems, Business popularity, Spatial Data Analyses

Procedia PDF Downloads 87
26903 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum

Authors: Abdulrahman Sumayli, Saad M. AlShahrani

Abstract:

For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improve the efficiency of the process. We investigated the H2 solubility computation in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were considered as the inputs to the models, while the hydrogen solubility was the sole response. Specifically, we employed three different models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models are optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks, and we compared their performance based on several metrics. Our results show that the WOA-SVR model tuned with WOA achieves the best performance overall, with an RMSE of 1.38 × 10− 2 and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. Besides, the solubility of hydrogen in the four heavy oil fractions is estimated in different ranges of temperatures and pressures of 150 ◦C–350 ◦C and 1.2 MPa–10.8 MPa, respectively

Keywords: temperature, pressure variations, machine learning, oil treatment

Procedia PDF Downloads 45
26902 Medical Knowledge Management since the Integration of Heterogeneous Data until the Knowledge Exploitation in a Decision-Making System

Authors: Nadjat Zerf Boudjettou, Fahima Nader, Rachid Chalal

Abstract:

Knowledge management is to acquire and represent knowledge relevant to a domain, a task or a specific organization in order to facilitate access, reuse and evolution. This usually means building, maintaining and evolving an explicit representation of knowledge. The next step is to provide access to that knowledge, that is to say, the spread in order to enable effective use. Knowledge management in the medical field aims to improve the performance of the medical organization by allowing individuals in the care facility (doctors, nurses, paramedics, etc.) to capture, share and apply collective knowledge in order to make optimal decisions in real time. In this paper, we propose a knowledge management approach based on integration technique of heterogeneous data in the medical field by creating a data warehouse, a technique of extracting knowledge from medical data by choosing a technique of data mining, and finally an exploitation technique of that knowledge in a case-based reasoning system.

Keywords: data warehouse, data mining, knowledge discovery in database, KDD, medical knowledge management, Bayesian networks

Procedia PDF Downloads 368