Search results for: bayesian
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 296

56 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components: a feature selection component and a classifier ensemble component. The feature selection component divides the features in the SEER database into four groups and then searches for the most important features among these groups, i.e., those that maximize the weighted average F-score of a given classification algorithm. The ensemble component uses three different classifiers, each of which models a different set of features from SEER selected through the feature selection module. On top of them, another classifier gives the final decision based on the output decisions and confidence scores of the underlying classifiers. Different classification algorithms were examined; the best setup found uses decision tree, Bayesian network, and Naïve Bayes algorithms for the underlying classifiers and Naïve Bayes for the classifier ensemble step. The system outperforms all systems published to date when evaluated on the same SEER data (1973-2002), giving an 87.39% weighted average F-score compared to 85.82% and 81.34% for the other published systems. When the data size is increased to cover the whole database (1973-2014), the overall weighted average F-score rises to 92.4% on the held-out test set.
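
A minimal sketch of this kind of stacked ensemble, assuming scikit-learn: since scikit-learn ships no Bayesian network classifier, a second Naive Bayes stands in for it, and synthetic features stand in for the SEER feature groups chosen by the feature selection component.

```python
# Illustrative sketch only: a second GaussianNB stands in for the Bayesian
# network, and make_classification stands in for the SEER feature groups.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three base classifiers (each would model a different feature group);
# a Naive Bayes meta-classifier combines their predicted probabilities.
ensemble = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("nb_a", GaussianNB()),                    # stand-in for the Bayesian network
        ("nb_b", GaussianNB(var_smoothing=1e-8)),
    ],
    final_estimator=GaussianNB(),
    stack_method="predict_proba",                  # pass confidence scores upward
)
ensemble.fit(X_train, y_train)
print("weighted F-score:", f1_score(y_test, ensemble.predict(X_test), average="weighted"))
```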

Keywords: classifier ensemble, breast cancer survivability, data mining, SEER

Procedia PDF Downloads 293
55 Information Communication Technology Based Road Traffic Accidents’ Identification, and Related Smart Solution Utilizing Big Data

Authors: Ghulam Haider Haidaree, Nsenda Lukumwena

Abstract:

Today the world of research enjoys abundant data, available in virtually every field: technology, science, business, politics, etc. This is commonly referred to as big data, and it offers a degree of precision and accuracy that supports an in-depth look at any decision-making process. Used well, big data affords its users the opportunity to produce substantially better-supported results. This paper leans extensively on big data to investigate possible smart solutions to urban mobility and related issues, namely road traffic accidents and their casualties and fatalities, based on multiple factors including age, gender, and the location of accident occurrences. Multiple technologies were used in combination to produce an Information Communication Technology (ICT) based solution with embedded technology; these technologies principally include Geographic Information Systems (GIS), the Orange data mining software, and Bayesian statistics, to name a few. The study uses the Leeds 2016 accident data to illustrate the thinking process and extracts from it a model that can be tested, evaluated, and replicated. The authors optimistically believe that the proposed model will significantly and smartly help to flatten the curve of road traffic accidents in fast-growing, densely populated areas, where motor-based mobility increases considerably.

Keywords: accident factors, geographic information system, information communication technology, mobility

Procedia PDF Downloads 172
54 Analyzing the Impact of Migration on HIV and AIDS Incidence Cases in Malaysia

Authors: Ofosuhene O. Apenteng, Noor Azina Ismail

Abstract:

The human immunodeficiency virus (HIV) that causes acquired immune deficiency syndrome (AIDS) remains a global cause of morbidity and mortality and has caused panic since its emergence. The relationship between migration and HIV/AIDS is complex. In the absence of prospectively designed studies, dynamic mathematical models that take migration into account can give very useful information. We explore the utility of mathematical models in understanding the transmission dynamics of HIV and AIDS and in assessing the magnitude of migration's impact on the disease. The model was calibrated to HIV and AIDS incidence data from the Malaysian Ministry of Health for the period 1986 to 2011, using Bayesian analysis combined with a Markov chain Monte Carlo (MCMC) approach to estimate the model parameters. From these estimates, the basic reproduction number was 22.5812. The rate at which susceptible individuals move to the HIV compartment has the highest sensitivity value, more significant than any of the remaining parameters; thus, the disease-free equilibrium is unstable. This is a major concern from the public health point of view, since the aim is to stabilize the epidemic at the disease-free equilibrium. These results suggest that the government, as a policy maker, should make further efforts to curb illegal activities performed by migrants. Our model reflects the dynamic behavior of the HIV/AIDS epidemic in Malaysia considerably well and could eventually be applied strategically to other countries.
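
A toy sketch of Bayesian MCMC calibration of a compartmental model, in the spirit of the abstract: the model below is a deliberately simplified susceptible-infected system with a constant migration inflow term, and all equations, priors, and starting values are illustrative assumptions, not the authors' model.

```python
# Minimal sketch: fit a simplified SI model with migration inflow m by MCMC
# (emcee); the "observed" series is synthetic, standing in for 1986-2011 data.
import numpy as np
import emcee
from scipy.integrate import odeint

def si_model(state, t, beta, mu, m):
    s, i = state
    n = s + i
    return [m - beta * s * i / n - mu * s,   # susceptibles: inflow, infection, removal
            beta * s * i / n - mu * i]       # infected: infection, removal

def log_posterior(theta, t_obs, i_obs):
    beta, mu, m, sigma = theta
    if not (0 < beta < 5 and 0 < mu < 1 and 0 <= m < 1e5 and 0 < sigma < 1e4):
        return -np.inf                       # flat prior within plausible bounds
    pred = odeint(si_model, [1e6, 100.0], t_obs, args=(beta, mu, m))[:, 1]
    return -0.5 * np.sum(((i_obs - pred) / sigma) ** 2)

t_obs = np.arange(26.0)                      # stand-in for 26 yearly observations
i_obs = odeint(si_model, [1e6, 100.0], t_obs, args=(0.6, 0.2, 500.0))[:, 1]

ndim, nwalkers = 4, 16
p0 = np.abs(np.random.default_rng(0).normal([0.5, 0.1, 400, 50], 1e-3, (nwalkers, ndim)))
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior, args=(t_obs, i_obs))
sampler.run_mcmc(p0, 2000, progress=False)

beta_hat, mu_hat = sampler.get_chain(discard=500, flat=True)[:, :2].mean(axis=0)
print("posterior-mean R0 ~", beta_hat / mu_hat)  # R0 = beta/mu in this toy model
```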

Keywords: epidemic model, reproduction number, HIV, MCMC, parameter estimation

Procedia PDF Downloads 339
53 Prediction of PM₂.₅ Concentration in Ulaanbaatar with Deep Learning Models

Authors: Suriya

Abstract:

Rapid socio-economic development and urbanization have led to an increasingly serious air pollution problem in Ulaanbaatar (UB), the capital of Mongolia. PM₂.₅ pollution has become the most pressing aspect of UB air pollution. Therefore, monitoring and predicting PM₂.₅ concentration in UB is of great significance for the health of the local people and for environmental management. To date, very few studies have used models to predict PM₂.₅ concentrations in UB. Using data from 0:00 on June 1, 2018, to 23:00 on April 30, 2020, we propose two deep learning models: a Bayesian-optimized LSTM (Bayes-LSTM) and a CNN-LSTM. We utilized hourly observed data, including Himawari-8 (H8) aerosol optical depth (AOD), meteorology, and PM₂.₅ concentration, as input for the prediction of PM₂.₅ concentrations. The correlation strengths between meteorology, AOD, and PM₂.₅ were analyzed using the grey correlation analysis method; whether the AOD input improved model performance was tested; and the performance of the models was evaluated using the mean absolute error (MAE) and root mean square error (RMSE). The prediction accuracies of both the Bayes-LSTM and CNN-LSTM deep learning models improved when AOD was included as an input parameter. The improvement was particularly pronounced for the CNN-LSTM model in the non-heating season; in the heating season, the prediction accuracy of the Bayes-LSTM model improved slightly, while that of the CNN-LSTM model decreased slightly. In summary, we propose two novel deep learning models for PM₂.₅ concentration prediction in UB, pioneer the use of AOD data from H8 for this task, and demonstrate that including AOD input data improves the performance of both proposed models.
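
A hedged sketch of the "Bayes-LSTM" idea using KerasTuner's Bayesian optimizer; the window length, search ranges, and feature count (e.g., AOD plus meteorology) are illustrative assumptions, not the paper's configuration.

```python
# Sketch: Bayesian hyperparameter optimization of an LSTM with KerasTuner.
import numpy as np
import keras
import keras_tuner as kt

WINDOW, N_FEATURES = 24, 6        # 24 hourly steps; features ~ AOD + meteorology + PM2.5

def build_model(hp):
    model = keras.Sequential([
        keras.layers.Input(shape=(WINDOW, N_FEATURES)),
        keras.layers.LSTM(hp.Int("units", 32, 128, step=32)),
        keras.layers.Dropout(hp.Float("dropout", 0.0, 0.5, step=0.1)),
        keras.layers.Dense(1),    # next-hour PM2.5 concentration
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Float("lr", 1e-4, 1e-2, sampling="log")),
        loss="mse", metrics=["mae"],
    )
    return model

# Synthetic stand-in for the hourly 2018-2020 record.
x = np.random.rand(500, WINDOW, N_FEATURES).astype("float32")
y = np.random.rand(500, 1).astype("float32")

tuner = kt.BayesianOptimization(build_model, objective="val_mae", max_trials=10,
                                overwrite=True, directory="tuning", project_name="pm25")
tuner.search(x, y, validation_split=0.2, epochs=5, verbose=0)
best = tuner.get_best_models(1)[0]   # then score with MAE/RMSE on a held-out set
```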

Keywords: deep learning, AOD, PM₂.₅, prediction, Ulaanbaatar

Procedia PDF Downloads 13
52 Prediction of Distillation Curve and Reid Vapor Pressure of Dual-Alcohol Gasoline Blends Using Artificial Neural Network for the Determination of Fuel Performance

Authors: Leonard D. Agana, Wendell Ace Dela Cruz, Arjan C. Lingaya, Bonifacio T. Doma Jr.

Abstract:

The purpose of this paper is to predict the fuel performance parameters, which include the drivability index (DI), vapor lock index (VLI), and vapor lock potential, using the distillation curve and Reid vapor pressure (RVP) of dual-alcohol gasoline fuel blends. The distillation curve and Reid vapor pressure were predicted using artificial neural networks (ANN), with macroscopic properties such as boiling points, RVP, and molecular weights as inputs. The ANN consists of 5 hidden layers and was trained using Bayesian regularization. The training mean square error (MSE) and R-value for the ANN of RVP are 91.4113 and 0.9151, respectively, while the training MSE and R-value for the distillation curve are 33.4867 and 0.9927. Fuel performance analysis of the dual alcohol-gasoline blends indicated that highly volatile gasoline blended with dual alcohols results in fuel blends that do not comply with the D4814 standard. Mixtures of low-volatility gasoline and 10% methanol or 10% ethanol can still be blended with up to 10% C3 and C4 alcohols. Gasoline of intermediate volatility containing 10% methanol or 10% ethanol can still be blended with C3 and C4 alcohols that have low RVPs, such as 1-propanol, 1-butanol, 2-butanol, and i-butanol.
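
A rough sketch of such an ANN regression set-up: scikit-learn has no Bayesian-regularization trainer (as in MATLAB's trainbr), so L2 weight decay is used as a crude stand-in, and the inputs (boiling point, RVP, molecular weight) and target are synthetic placeholders.

```python
# Sketch only: L2-penalised MLP as a stand-in for Bayesian regularization;
# all data below are placeholders, not the paper's blend measurements.
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform([30, 10, 30], [200, 100, 120], size=(300, 3))  # Tb, RVP, MW
y = 0.4 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(0, 2, 300)      # placeholder target

model = MLPRegressor(hidden_layer_sizes=(16,) * 5,  # 5 hidden layers, as in the paper
                     alpha=1e-2,                    # L2 penalty ~ regularization role
                     max_iter=5000, random_state=0)
model.fit(X, y)
pred = model.predict(X)
print("train MSE:", mean_squared_error(y, pred),
      "R:", np.corrcoef(y, pred)[0, 1])
```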

Keywords: dual alcohol-gasoline blends, distillation curve, machine learning, reid vapor pressure

Procedia PDF Downloads 64
51 An Integrated Approach for Risk Management of Transportation of HAZMAT: Use of Quality Function Deployment and Risk Assessment

Authors: Guldana Zhigerbayeva, Ming Yang

Abstract:

Transportation of hazardous materials (HAZMAT) is inevitable in the process industries, and statistics show that a significant number of accidents have occurred during HAZMAT transportation. This makes risk management of HAZMAT transportation an important topic. Tree-based methods, including fault trees, event trees, and cause-consequence analysis, as well as Bayesian networks, have been applied to risk management of HAZMAT transportation. However, there is limited work on the development of a systematic approach: the existing approaches fail to build linkages between regulatory requirements and the development of safety measures. Analysis of historical data from past accident report databases limits the focus to specific incidents and their specific causes, so essential elements of risk management may be overlooked, including regulatory compliance, field expert opinions, and suggestions. A systematic approach is needed to translate the regulatory requirements of HAZMAT transportation into specified safety measures (both technical and administrative) to support the risk management process. This study first adapts the House of Quality (HoQ) into a House of Safety (HoS) and proposes a new approach: Safety Function Deployment (SFD). The results of SFD will be used in a multi-criteria decision-support system to find an optimal route for HAZMAT transportation. The proposed approach is demonstrated through a hypothetical transportation case in Kazakhstan.

Keywords: hazardous materials, risk assessment, risk management, quality function deployment

Procedia PDF Downloads 109
50 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach to the classification of unstructured format descriptions for the identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection using just an unstructured text description that comprises the most important format features for a particular organisation. The file format identification method then employs a file format classifier and associated configurations to support digital preservation experts with an estimate of the required file format. Our goal is to make use of a format specification knowledge base aggregated from different Web sources in order to select a file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert the file format for their institution. The proposed methods facilitate file format selection and improve the quality of the digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To support this, the aggregated information about the file formats is presented as a file format vocabulary that comprises the most common terms characteristic of all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. A sample file format calculation and the calculation results, including probabilities, are presented in the evaluation section.
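
A toy sketch of the core idea: a naive Bayes classifier over a vocabulary of format-characteristic terms. The training snippets and labels are invented placeholders, not the paper's knowledge base.

```python
# Classify an unstructured format description with naive Bayes (sketch).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

descriptions = [
    "lossless raster image, open specification, wide tool support",
    "page layout, embedded fonts, ISO standardised, self-contained",
    "plain text, line based, universally readable",
    "proprietary raster image, patent encumbered, legacy viewers",
]
formats = ["PNG", "PDF/A", "TXT", "legacy-image"]   # invented label set

classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(descriptions, formats)

query = "open standardised page format with embedded fonts"
print(classifier.predict([query])[0])               # recommended format
print(classifier.predict_proba([query]).round(3))   # probabilities shown to the expert
```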

Keywords: data mining, digital libraries, digital preservation, file format

Procedia PDF Downloads 471
49 Ambivalence as Ethical Practice: Methodologies to Address Noise and Bias in Care and Contact Evaluations

Authors: Anthony Townsend, Robyn Fasser

Abstract:

While complete objectivity is a desirable scientific position from which to conduct a care and contact evaluation (CCE), it is precisely the recognition that we are inherently incapable of operating objectively that is the foundation of ethical practice and skilled assessment. Drawing upon recent research by Daniel Kahneman (2021) on the differences between noise and bias, as well as the inherent biases collectively termed “The Elephant in the Brain” by Kevin Simler and Robin Hanson (2019, Oxford University Press), this presentation addresses the various ways in which our judgments, perceptions, and even procedures can be distorted and contaminated while conducting a CCE, and also considers the value of second-order cybernetics and the psychodynamic concept of ‘ambivalence’ as a conceptual basis for assessment methodologies that limit such errors, or at least identify them better. Both a conceptual framework for ambivalence (our higher-order capacity to allow for the convergence and consideration of multiple emotional experiences and cognitive perceptions to inform our reasoning) and a practical assessment methodology relying on data triangulation, Bayesian inference, and hypothesis testing are presented as a means of promoting ethical practice for healthcare professionals conducting CCEs. An emphasis on widening awareness and perspective, limiting ‘splitting’, is demonstrated both in how this form of emotional processing plays out in alienating family dynamics and in the assessment thereof. In addressing this concept, the presentation aims to illuminate the value of ambivalence as foundational to ethical practice for assessors.

Keywords: ambivalence, forensic, psychology, noise, bias, ethics

Procedia PDF Downloads 63
48 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records

Authors: Sara ElElimy, Samir Moustafa

Abstract:

Mobile network operators face many challenges in the digital era, especially high demand from customers. Mobile networks are a major source of big data, and traditional techniques are not effective in the new era of big data, the Internet of Things (IoT), and 5G; as a result, handling different big datasets effectively becomes a vital task for operators, given the continuous growth of data and the move from Long-Term Evolution (LTE) to 5G. There is thus an urgent need for effective big data analytics to predict future demand, traffic, and network performance in order to fulfill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and a recurrent neural network (RNN) are employed in a data-driven application for mobile network operators. The framework covers identification of the parameters of each model, estimation, prediction, and the final data-driven application of the prediction to business and network performance use cases. These models are applied to the Telecom Italia Big Data Challenge call detail records (CDRs) datasets. The performance of the models, measured using well-known evaluation criteria, shows that ARIMA (the machine learning-based model) is more accurate as a predictive model on this dataset than the RNN (the deep learning model).
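
A minimal sketch of the ARIMA side of such a comparison, assuming statsmodels; the series below is a synthetic stand-in for aggregated CDR traffic, and the (p, d, q) order is an illustrative choice, not the paper's fitted order.

```python
# Fit an ARIMA model to an hourly traffic-like series and forecast one day ahead.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
hours = pd.date_range("2013-11-01", periods=7 * 24, freq="h")
traffic = pd.Series(100 + 20 * np.sin(2 * np.pi * np.arange(len(hours)) / 24)
                    + rng.normal(0, 3, len(hours)), index=hours)

model = ARIMA(traffic, order=(2, 1, 2)).fit()
forecast = model.forecast(steps=24)      # next day's hourly demand
print(forecast.head())
# In the paper's set-up, the same series would also be fed to an RNN and both
# predictions scored with a common evaluation criterion.
```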

Keywords: big data analytics, machine learning, CDRs, 5G

Procedia PDF Downloads 110
47 Determining the Performance of Data Mining Algorithms in Identifying the Influential Factors and Predicting Ischemic Stroke: A Comparative Study in the Southeast of Iran

Authors: Y. Mehdipour, S. Ebrahimi, A. Jahanpour, F. Seyedzaei, B. Sabayan, A. Karimi, H. Amirifard

Abstract:

Ischemic stroke is one of the most common causes of disability and mortality: it is the fourth leading cause of death in the world (the third, according to some sources). Only one-third of patients with ischemic stroke fully recover; one-third are left permanently disabled, and one-third die. The use of predictive models therefore has a vital role in reducing the complications and costs related to this disease, and the aim of this study was to identify the influential factors and predict ischemic stroke with the help of DM methods. The present study was descriptive-analytic. The population comprised 213 cases from among patients referred to Ali ibn Abi Talib (AS) Hospital in Zahedan. The data collection tool was a checklist whose validity and reliability were confirmed. The study used decision tree DM algorithms for modeling, and data analysis was performed using SPSS-19 and SPSS Modeler 14.2. The comparison of algorithms showed that the CHAID algorithm, with 95.7% accuracy, performed best. Moreover, based on the resulting model, factors such as anemia, diabetes mellitus, hyperlipidemia, transient ischemic attacks, coronary artery disease, and atherosclerosis are the most influential factors in stroke. Decision tree algorithms, especially the CHAID algorithm, have acceptable precision and predictive ability for determining the factors affecting ischemic stroke. Predictive models built with this algorithm can therefore play a significant role in decreasing the mortality and disability caused by ischemic stroke.
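
A sketch of the decision tree modelling step: CHAID itself is not available in scikit-learn, so a CART DecisionTreeClassifier stands in, and the risk-factor columns and records are placeholders for the 213-case checklist data.

```python
# Decision tree on binary clinical risk factors (sketch with synthetic data).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# anemia, diabetes, hyperlipidemia, TIA, CAD, atherosclerosis (binary flags)
X = rng.integers(0, 2, size=(213, 6))
y = (X.sum(axis=1) + rng.integers(0, 2, 213) > 3).astype(int)  # placeholder outcome

tree = DecisionTreeClassifier(max_depth=4, random_state=0)
print("CV accuracy:", cross_val_score(tree, X, y, cv=5).mean())

tree.fit(X, y)
print("feature importances:", tree.feature_importances_.round(2))  # influential factors
```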

Keywords: data mining, ischemic stroke, decision tree, Bayesian network

Procedia PDF Downloads 147
46 Price Effect Estimation of Tobacco on Low-wage Male Smokers: A Causal Mediation Analysis

Authors: Kawsar Ahmed, Hong Wang

Abstract:

The goal of this study was to estimate the causal mediation impact of a tobacco tax before and after price hikes among low-income male smokers, with particular emphasis on the pathway-estimation framework for continuous and dichotomous variables. From July to December 2021, cross-sectional observational data (n=739) were collected from Bangladeshi low-wage smokers. A quasi-Bayesian technique, a binomial probit model, and simulation-based sensitivity analysis, implemented with the R mediation package, were used to estimate the effects. After the price rise for tobacco products, the average number of cigarette or bidi sticks consumed decreased from 6.7 to 4.56. Rising tobacco prices have a direct effect on low-income people's decisions to quit or reduce their daily smoking: average causal mediation effect (ACME) [effect=2.31, 95% confidence interval (CI) = (0.00-4.71), p<0.01], average direct effect (ADE) [effect=8.6, 95% CI = (0.11-6.8), p<0.001], and overall significant effects (p<0.001). The proportion of the effect mediated by income is 26.1% following the price rise. In the sensitivity analysis, the ACME and ADE curves, based on the observed coefficients of determination, support the hypothesized model of a substantial effect following price rises. Price increases through taxation thus show a positive causal mediation through income, affecting the decision to limit tobacco use, and can inform healthcare policy for low-income men.
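
The paper used the R mediation package; a Python analogue of the same Imai-style quasi-Bayesian mediation analysis exists in statsmodels, sketched below. The variable names and simulated data are illustrative placeholders, not the study's data.

```python
# Quasi-Bayesian causal mediation with statsmodels (sketch on simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.mediation import Mediation

rng = np.random.default_rng(0)
n = 739
df = pd.DataFrame({"price_hike": rng.integers(0, 2, n)})         # exposure
df["income"] = 50 + 5 * df["price_hike"] + rng.normal(0, 10, n)  # mediator
df["sticks"] = (7 - 1.5 * df["price_hike"] - 0.02 * df["income"]
                + rng.normal(0, 1, n))                           # outcome

outcome_model = sm.OLS.from_formula("sticks ~ price_hike + income", df)
mediator_model = sm.OLS.from_formula("income ~ price_hike", df)

med = Mediation(outcome_model, mediator_model, "price_hike", "income")
print(med.fit(method="parametric", n_rep=500).summary())  # ACME, ADE, prop. mediated
```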

Keywords: causal mediation analysis, directed acyclic graphs, tobacco price policy, sensitivity analysis, pathway estimation

Procedia PDF Downloads 84
45 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay, and bandwidth are the prime concerns in wireless sensor networks (WSNs). Optimizing energy usage and using bandwidth efficiently are important issues in a WSN, and event-triggered data aggregation facilitates such optimization for the event-affected area. Reliable delivery of critical information to the sink node is another major challenge. To tackle these issues, we propose an event-driven dynamic clustering and data aggregation scheme for WSNs that enhances the lifetime of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever an event is triggered, the event-triggered node selects a cluster head. (2) The cluster head gathers data from the sensor nodes within the cluster. (3) The cluster head identifies and classifies the events in the collected data using a Bayesian classifier. (4) The data is aggregated using a statistical method. (5) The cluster head discovers paths to the sink node using residual energy, path distance, and bandwidth. (6) If the aggregated data is critical, the cluster head sends it over multiple paths for reliable data communication. (7) Otherwise, the aggregated data is transmitted towards the sink node over the single path with the most bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time, and energy consumed for aggregation.
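
A toy sketch of steps (3)-(4) and the routing choice in (6)-(7): the cluster head classifies readings with a Bayesian (naive Bayes) classifier and aggregates them with a simple statistic. The sensor readings, classes, and thresholds are invented placeholders.

```python
# Cluster-head classification and aggregation (sketch with synthetic readings).
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
# training data: (temperature, vibration) readings labelled 0=normal, 1=critical
X_train = np.vstack([rng.normal([25, 0.1], 1, (50, 2)),
                     rng.normal([60, 0.9], 1, (50, 2))])
y_train = np.repeat([0, 1], 50)
classifier = GaussianNB().fit(X_train, y_train)

cluster_readings = rng.normal([58, 0.8], 2, (10, 2))  # gathered within the cluster
labels = classifier.predict(cluster_readings)         # step (3): Bayesian classification

aggregate = cluster_readings.mean(axis=0)             # step (4): statistical aggregation
is_critical = labels.mean() > 0.5                     # steps (6)-(7): route choice
route = "multipath (reliable)" if is_critical else "best single path"
print(aggregate.round(2), route)
```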

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 414
44 The Role of Financial Development and Institutional Quality in Promoting Sustainable Development through Tourism Management

Authors: Hashim Zameer

Abstract:

Effective tourism management plays a vital role in promoting sustainability and supporting ecosystems. A principle long in practice is “first pollute and then clean,” indicating that countries need financial resources to promote sustainability. Both financial development and tourism management thus seem very important for promoting sustainable development; however, without institutional support, it is very difficult to succeed. In this context, it is particularly significant to explore how institutional quality, tourism development, and financial development can promote sustainable development. Past research has not explored the role of tourism development in sustainable development, and the roles of financial development, natural resources, and institutional quality have likewise been ignored. This paper therefore investigates the role of tourism development, natural resources, financial development, and institutional quality in sustainable development in China. The study used time-series data from 2000-2021 and employed a Bayesian linear regression model, which is suitable for small data sets; the robustness of the findings was checked using a quantile regression approach. The results reveal that an increase in tourism expenditure stimulates the economy, creates jobs, encourages cultural exchange, and supports sustainability initiatives. Moreover, financial development and institutional quality have a positive effect on sustainable development. However, reliance on natural resources can result in negative economic, social, and environmental outcomes, highlighting the need for resource diversification and management to reinforce sustainable development. These results highlight the significance of financial development, strong institutions, sustainable tourism, and careful utilization of natural resources for long-term sustainability. The study holds vital insights for policy formulation to promote sustainable tourism.
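
A minimal sketch of Bayesian linear regression in PyMC, the kind of model suited to a short annual series; the predictors and data below are synthetic placeholders for tourism, financial development, institutional quality, and natural resources.

```python
# Bayesian linear regression on a 22-point annual series (sketch).
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n = 22                                  # annual observations, 2000-2021
X = rng.normal(size=(n, 4))             # tourism, finance, institutions, resources
y_obs = X @ np.array([0.5, 0.3, 0.4, -0.2]) + rng.normal(0, 0.3, n)

with pm.Model() as model:
    beta = pm.Normal("beta", mu=0, sigma=1, shape=4)  # weakly informative priors
    sigma = pm.HalfNormal("sigma", sigma=1)
    pm.Normal("y", mu=pm.math.dot(X, beta), sigma=sigma, observed=y_obs)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0, progressbar=False)

# posterior means of the four coefficients
print(idata.posterior["beta"].mean(dim=("chain", "draw")).values)
```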

Keywords: sustainability, tourism development, financial development, institutional quality

Procedia PDF Downloads 46
43 Bioinformatic Approaches in Population Genetics and Phylogenetic Studies

Authors: Masoud Sheidai

Abstract:

Biologists specializing in population genetics and phylogeny face varied research tasks, such as assessing populations' genetic variability and divergence, species relatedness, the evolution of genetic and morphological characters, and the identification of DNA SNPs with adaptive potential. To tackle these problems and reach concise conclusions, they must use proper and efficient statistical and bioinformatic methods, as well as suitable genetic and morphological characteristics. In recent years, bioinformatic and statistical methods based on various well-documented assumptions have become the proper analytical tools in researchers' hands. Species delineation is usually carried out using clustering methods, such as K-means clustering, based on distance measures appropriate to the studied features of the organisms. A well-defined species is assumed to be separated from other taxa by molecular barcodes. Species relationships are studied using molecular markers, analyzed with methods such as multidimensional scaling (MDS) and principal coordinate analysis (PCoA). Species population structuring and genetic divergence are usually investigated with PCoA and PCA methods and a network diagram, based on bootstrapping of the data. The association of genes and DNA sequences with ecological and geographical variables is determined by latent factor mixed models (LFMM) and redundancy analysis (RDA), which are based on Bayesian and distance methods. Molecular and morphological differentiating characters in the studied species may be identified by linear discriminant analysis (DA) and discriminant analysis of principal components (DAPC). We illustrate these methods and related conclusions with examples from different edible and medicinal plant species.
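
A short sketch of two of the listed tools, under the assumption of scikit-learn: K-means for species/population delineation and metric MDS (equivalent to PCoA when applied to a Euclidean distance matrix) for visualising relationships. The marker data are synthetic.

```python
# K-means delineation plus PCoA-style ordination (sketch).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
markers = np.vstack([rng.normal(0, 1, (30, 50)),   # population A, 50 marker scores
                     rng.normal(3, 1, (30, 50))])  # population B

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(markers)

dist = pairwise_distances(markers)                 # distance measure of choice
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dist)   # 2-D ordination for plotting
print(clusters[:5], coords[:2].round(2))
```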

Keywords: GWAS analysis, K-Means clustering, LFMM, multidimensional scaling, redundancy analysis

Procedia PDF Downloads 90
42 Molecular Identification and Evolutionary Status of Lucilia bufonivora: An Obligate Parasite of Amphibians in Europe

Authors: Gerardo Arias, Richard Wall, Jamie Stevens

Abstract:

Lucilia bufonivora Moniez is an obligate parasite of toads and frogs, widely distributed in Europe. Its sister taxon, Lucilia silvarum Meigen, behaves mainly as a carrion breeder in Europe; however, it has been reported as a facultative parasite of amphibians. These two closely related species are morphologically almost identical, which has led to misidentification; in fact, it has been suggested that the amphibian myiasis cases attributed to L. silvarum in Europe should be attributed to L. bufonivora. Both species remain poorly studied, and their taxonomic relationships are still unclear. Identifying the larval specimens involved in amphibian myiasis with molecular tools, together with phylogenetic analysis of these two closely related species, may resolve this problem. In this work, seventeen unidentified larval specimens extracted from toad myiasis cases in the UK, the Netherlands, and Switzerland were obtained; their COX1 (mtDNA) and EF1-α (nuclear DNA) gene regions were amplified and sequenced. All 17 larval samples were identified with both molecular markers as L. bufonivora. Phylogenetic analysis was carried out with 10 other blowfly species, including L. silvarum samples from the UK and USA. Bayesian inference trees of COX1 and of a combined-gene dataset suggested that L. silvarum and L. bufonivora are separate sister species. However, the nuclear gene EF1-α does not appear to resolve their relationships, suggesting that the rates of evolution of the mtDNA are much faster than those of the nuclear DNA. This work provides molecular evidence for the successful identification of L. bufonivora and a molecular analysis of populations of this obligate parasite from different locations across Europe. The relationships with L. silvarum are discussed.

Keywords: calliphoridae, molecular evolution, myiasis, obligate parasitism

Procedia PDF Downloads 204
41 An Ensemble System of Classifiers for Computer-Aided Volcano Monitoring

Authors: Flavio Cannavo

Abstract:

Continuous evaluation of the status of potentially hazardous volcanoes plays a key role in civil protection. Monitoring volcanic activity, especially energetic paroxysms that usually come with tephra emissions, is crucial not only because of the exposure of the local population but also for airline traffic. Presently, real-time surveillance of most volcanoes worldwide is essentially delegated to one or more human experts in volcanology, who interpret data coming from different kinds of monitoring networks. Unfortunately, the high nonlinearity of the complex and coupled volcanic dynamics leads to a large variety of volcanic behaviors. Moreover, the continuously measured parameters (e.g., seismic, deformation, infrasonic, and geochemical signals) are often unable to fully explain the ongoing phenomenon, making fast volcano state assessment a very puzzling task for the personnel on duty in the control rooms. With the aim of aiding the personnel on duty in volcano surveillance, we introduce here a system based on an ensemble of data-driven classifiers that automatically infers the ongoing volcano status from all the different kinds of measurements available. The system consists of a heterogeneous set of independent classifiers, each built with its own data and algorithm, and each giving an output about the volcanic status. The ensemble technique weights the single classifier outputs to combine all the classifications into a single status estimate that maximizes performance. We tested the model on the Mt. Etna (Italy) case study, considering a long record of multivariate data from 2011 to 2015, and cross-validated it. Results indicate that the proposed model is effective and powerful for decision-making purposes.
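
A sketch of the ensemble combination step: independent classifiers, each trained on its own monitoring stream, vote on the volcano status with performance-based weights. The streams, status labels, probabilities, and weights below are invented placeholders.

```python
# Weighted combination of independent classifier outputs (sketch).
import numpy as np

STATUSES = ["quiet", "unrest", "paroxysm"]

# per-classifier class-probability outputs for the current time window
p_seismic    = np.array([0.2, 0.5, 0.3])
p_deform     = np.array([0.4, 0.4, 0.2])
p_infrasound = np.array([0.1, 0.3, 0.6])

# weights, e.g. proportional to each classifier's cross-validated accuracy
w = np.array([0.5, 0.2, 0.3])

combined = w @ np.vstack([p_seismic, p_deform, p_infrasound])
print(STATUSES[int(np.argmax(combined))], combined.round(3))
```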

Keywords: Bayesian networks, expert system, mount Etna, volcano monitoring

Procedia PDF Downloads 218
40 Reliability Qualification Test Plan Derivation Method for Weibull Distributed Products

Authors: Ping Jiang, Yunyan Xing, Dian Zhang, Bo Guo

Abstract:

The reliability qualification test (RQT) is widely used in product development to qualify whether a product meets predetermined reliability requirements, which are mainly described in terms of reliability indices, for example, MTBF (mean time between failures). In engineering practice, RQT plans follow standards such as MIL-STD-781 or GJB899A-2009. These conventional RQT plans are not preferred, however, as they often require long test times or carry high risks for both producer and consumer, because the methods in the standards use only the test data of the product itself. Moreover, the standards usually assume that the product's lifetime is exponentially distributed, which is unsuitable for complex products other than electronics. It is therefore desirable to develop an RQT plan derivation method that safely shortens test time while keeping the two risks under control. To this end, an RQT plan derivation method is developed for products whose lifetimes follow a Weibull distribution. The merit of the method is that expert judgment is taken into account: applying the Bayesian method, the expert judgment is translated into prior information on product reliability, and the producer's risk and the consumer's risk are then calculated accordingly. The procedures to derive RQT plans are also proposed in this paper. As extra information and expert judgment are added to the derivation, the derived test plans have the potential to shorten the required test time and to achieve satisfactorily low risks for both producer and consumer, compared with conventional test plans. A case study demonstrates that when expert judgment is used in deriving product test plans, the proposed method is capable of finding ideal test plans that not only reduce the two risks but also shorten the required test time.
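
A Monte Carlo sketch of evaluating one candidate RQT plan for Weibull lifetimes: n units tested for time T, accept if at most c failures, with expert judgment entering as a prior over the Weibull scale. All numbers (shape, scale, prior, plan) are illustrative assumptions, not the paper's derivation.

```python
# Producer's/consumer's risk of a Weibull RQT plan by simulation (sketch).
import numpy as np

rng = np.random.default_rng(0)
shape, n_units, T, c = 1.5, 20, 500.0, 2          # candidate test plan

def prob_accept(scale, n_sim=20_000):
    lifetimes = scale * rng.weibull(shape, (n_sim, n_units))
    failures = (lifetimes <= T).sum(axis=1)
    return (failures <= c).mean()

good_scale, bad_scale = 2000.0, 800.0             # spec-met vs spec-violating product
producer_risk = 1 - prob_accept(good_scale)       # rejecting a good product
consumer_risk = prob_accept(bad_scale)            # accepting a bad product

# Bayesian variant: average acceptance probability over an expert prior
prior_scales = rng.lognormal(np.log(2000.0), 0.15, 200)
prior_accept = np.mean([prob_accept(s, 2000) for s in prior_scales])
print("producer risk:", round(producer_risk, 3),
      "consumer risk:", round(consumer_risk, 3),
      "prior-averaged P(accept):", round(prior_accept, 3))
```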

Keywords: expert judgment, reliability qualification test, test plan derivation, producer’s risk, consumer’s risk

Procedia PDF Downloads 92
39 Artificial Intelligence Techniques for Enhancing Supply Chain Resilience: A Systematic Literature Review, Holistic Framework, and Future Research

Authors: Adane Kassa Shikur

Abstract:

Today’s supply chains (SC) have become vulnerable to unexpected and ever-intensifying disruptions from myriad sources. Consequently, the concept of supply chain resilience (SCRes) has become crucial to complement the conventional risk management paradigm, which has failed to cope with unexpected SC disruptions, resulting in severe consequences for SC performance and making business continuity questionable. Advancements in cutting-edge technologies like artificial intelligence (AI), and their potential to enhance SCRes by improving critical antecedents in its different phases, have attracted the attention of scholars and practitioners. Academic research and the practical interest of industry have yielded significant publications at the nexus of AI and SCRes during the last two decades. However, the applications and examinations have been conducted primarily independently, and the extant literature is dispersed into research streams, despite the complex nature of SCRes. To close this gap, this study conducts a systematic literature review of 106 peer-reviewed articles, curating, synthesizing, and consolidating up-to-date literature, and presents the state-of-the-art development from 2010 to 2022. Bayesian networks are the most topical of the 13 AI techniques evaluated. Concerning the critical antecedents, visibility is the one most often realized by the techniques. The study revealed that AI techniques support only the first three phases of SCRes (readiness, response, and recovery), with readiness the most popular, while no evidence was found for the growth phase. The study proposes an AI-SCRes framework to inform research and practice and to approach SCRes holistically. It also provides implications for practice, policy, and theory, as well as gaps for impactful future research.

Keywords: ANNs, risk, Bayesian networks, vulnerability, resilience

Procedia PDF Downloads 44
38 Forecasting Lake Malawi Water Level Fluctuations Using Stochastic Models

Authors: M. Mulumpwa, W. W. L. Jere, M. Lazaro, A. H. N. Mtethiwa

Abstract:

The study considered seasonal autoregressive integrated moving average (SARIMA) processes to select an appropriate stochastic model to forecast monthly Lake Malawi water levels for the period 1986 through 2015. The appropriate model was chosen within the SARIMA(p, d, q)(P, D, Q)s framework. The autocorrelation function (ACF), partial autocorrelation function (PACF), Akaike information criterion (AIC), Bayesian information criterion (BIC), Box-Ljung statistics, correlogram, and distribution of residual errors were examined. A SARIMA(1, 1, 0)(1, 1, 1)12 model was selected to forecast the monthly Lake Malawi water levels from August 2015 to December 2021. The plotted time series showed that Lake Malawi water levels have been decreasing since 2010, but not as much as was the case from 1995 through 1997. The forecast of the Lake Malawi water levels until 2021 showed a mean of 474.47 m, ranging from 473.93 to 475.02 m at the 80% and 90% confidence intervals, against registered means of 473.398 m in 1997 and 475.475 m in 1989, the lowest and highest water levels in the lake, respectively, since 1986. The forecast also showed that the water level of Lake Malawi will drop by 0.57 m compared to the mean water levels recorded in previous years. These results suggest that the Lake Malawi water level is not likely to go lower than that recorded in 1997. Utilisation and management of water-related activities and programs on the lake should therefore provide room for such scenarios. The findings suggest a need to manage Lake Malawi jointly and prudently with other stakeholders, starting from the catchment area; this will reduce the impacts of anthropogenic activities on the lake's water quality, water level, and aquatic and adjacent terrestrial ecosystems, thereby ensuring its resilience to climate change impacts.
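
A direct sketch of the selected model with statsmodels: SARIMA(1,1,0)(1,1,1)12 fitted and forecast ahead, with the series below being synthetic monthly data standing in for the 1986-2015 Lake Malawi record.

```python
# Fit SARIMA(1,1,0)(1,1,1)12 and forecast Aug 2015 - Dec 2021 (sketch).
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
months = pd.date_range("1986-01", periods=360, freq="MS")
levels = pd.Series(474 + 0.5 * np.sin(2 * np.pi * np.arange(360) / 12)
                   + np.cumsum(rng.normal(0, 0.02, 360)), index=months)

model = SARIMAX(levels, order=(1, 1, 0), seasonal_order=(1, 1, 1, 12)).fit(disp=False)
forecast = model.get_forecast(steps=77)               # 77 months: Aug 2015 - Dec 2021
print(forecast.predicted_mean.tail(3).round(2))
print(forecast.conf_int(alpha=0.2).tail(3).round(2))  # 80% interval, as in the study
```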

Keywords: forecasting, Lake Malawi, water levels, water level fluctuation, climate change, anthropogenic activities

Procedia PDF Downloads 197
37 A Bayesian Classification System for Facilitating an Institutional Risk Profile Definition

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the easy creation and classification of institutional risk profiles, supporting endangerment analysis of file formats. The main contribution of this work is the employment of data mining techniques to support the identification of the most important risk factors. Risk profiles then employ a risk factor classifier and associated configurations to support digital preservation experts with a semi-automatic estimation of the endangerment group for file format risk profiles. Our goal is to make use of an expert knowledge base, acquired through a digital preservation survey, in order to detect preservation risks for a particular institution. Another contribution is support for the visualisation of risk factors along a required dimension of analysis. Using the naive Bayes method, the decision support system recommends to an expert the matching risk profile group for the previously selected institutional risk profile. The proposed methods improve the visibility of risk factor values and the quality of the digital preservation process. The presented approach is designed to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and the values of file format risk profiles. To facilitate decision-making, the aggregated information about the risk factors is presented as a multidimensional vector; the goal is to visualise particular dimensions of this vector for analysis by an expert and to define its profile group. A sample risk profile calculation and the visualisation of some risk factor dimensions are presented in the evaluation section.

Keywords: linked open data, information integration, digital libraries, data mining

Procedia PDF Downloads 397
36 Improving 99mTc-tetrofosmin Myocardial Perfusion Images by Time Subtraction Technique

Authors: Yasuyuki Takahashi, Hayato Ishimura, Masao Miyagawa, Teruhito Mochizuki

Abstract:

Quantitative measurement of myocardial perfusion is possible with single photon emission computed tomography (SPECT) using a semiconductor detector. However, accumulation of 99mTc-tetrofosmin in the liver may make it difficult to assess perfusion accurately in the inferior myocardium. Our idea is to reduce the high accumulation in the liver by using dynamic SPECT imaging and a technique called time subtraction. We evaluated the performance of a new SPECT system with a cadmium-zinc-telluride solid-state semiconductor detector (Discovery NM 530c; GE Healthcare). Our system acquired list-mode raw data over 10 minutes for a typical patient. From these data, ten SPECT images were reconstructed, one for every minute of acquired data. Reconstruction with the semiconductor detector was based on an implementation of a 3-D iterative Bayesian reconstruction algorithm. We studied 20 patients with coronary artery disease (mean age 75.4 ± 12.1 years; range 42-86; 16 males and 4 females). In each subject, 259 MBq of 99mTc-tetrofosmin was injected intravenously, and we performed both a phantom and a clinical study using dynamic SPECT. An approximation to a liver-only image is obtained by reconstructing an image from the early projections, during which time the liver accumulation dominates (the 0.5-2.5 minute SPECT image minus the 5-10 minute SPECT image). The extracted liver-only image is then subtracted from a later SPECT image that shows both the liver and the myocardial uptake (the 5-10 minute SPECT image minus the liver-only image). The time subtraction of the liver was possible in both the phantom and the clinical study, and visualization of the inferior myocardium was improved. In past reports, uptake in the inferior myocardium overlapped by high liver accumulation was undiagnosable; using our time subtraction method, the image quality of the 99mTc-tetrofosmin myocardial SPECT image is considerably improved.
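
The arithmetic of the time-subtraction step, sketched on toy arrays: estimate a liver-only image from the early frame and subtract it from the later frame. Array contents are invented; real inputs would be the reconstructed SPECT volumes.

```python
# Time subtraction on toy 2x2 "images" (sketch of the arithmetic only).
import numpy as np

early = np.array([[9.0, 1.0], [1.0, 1.0]])   # 0.5-2.5 min frame: liver dominates
late  = np.array([[6.0, 4.0], [1.0, 5.0]])   # 5-10 min frame: liver + myocardium

liver_only = np.clip(early - late, 0, None)      # (0.5-2.5 min) - (5-10 min)
corrected = np.clip(late - liver_only, 0, None)  # (5-10 min) - liver-only
print(corrected)                                 # liver counts suppressed
```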

Keywords: 99mTc-tetrofosmin, dynamic SPECT, time subtraction, semiconductor detector

Procedia PDF Downloads 298
35 Comparison of Different Artificial Intelligence-Based Protein Secondary Structure Prediction Methods

Authors: Jamerson Felipe Pereira Lima, Jeane Cecília Bezerra de Melo

Abstract:

The difficulty and cost of obtaining protein tertiary structure information through experimental methods, such as X-ray crystallography or NMR spectroscopy, have spurred the development of computational alternatives. One such approach is the prediction of the three-dimensional structure from the residue chain; however, this has been proven to be an NP-hard problem, the complexity of which is illustrated by the Levinthal paradox. An alternative is the prediction of intermediary structures, such as the secondary structure of the protein. Artificial intelligence methods, such as Bayesian statistics, artificial neural networks (ANN), and support vector machines (SVM), among others, have been used to predict protein secondary structure. Due to their good results, artificial neural networks have become a standard method for this task. Recently published methods that use this technique generally achieve a Q3 accuracy between 75% and 83%, whereas the theoretical accuracy limit for protein secondary structure prediction is 88%. Alternatively, to achieve better results, prediction methods based on support vector machines have been developed. The statistical evaluation of methods that use different AI techniques, such as ANNs and SVMs, is not a trivial problem, since different training sets and validation techniques, as well as other variables, can influence the behavior of a prediction method. In this study, we propose a prediction method based on artificial neural networks, which is then compared with a selected SVM method: the one proposed by Huang in his work Extracting Physicochemical Features to Predict Protein Secondary Structure (2013). The developed ANN method follows the same training and testing process used by Huang to validate his method, comprising the CB513 protein data set and three-fold cross-validation, so that the statistical results of the two methods can be compared directly.

Keywords: artificial neural networks, protein secondary structure, protein structure prediction, support vector machines

Procedia PDF Downloads 582
34 Developing a DNN Model for the Production of Biogas From a Hybrid BO-TPE System in an Anaerobic Wastewater Treatment Plant

Authors: Hadjer Sadoune, Liza Lamini, Scherazade Krim, Amel Djouadi, Rachida Rihani

Abstract:

Deep neural networks are highly regarded for their accuracy in predicting intricate fermentation processes; their ability to learn from large datasets makes them particularly effective models. The primary obstacle in improving the performance of these models is carefully choosing suitable hyperparameters, including the neural network architecture (number of hidden layers and hidden units), activation function, optimizer, learning rate, and other relevant factors. This study predicts biogas production from real wastewater treatment plant data using a sophisticated approach: hybrid Bayesian optimization with a tree-structured Parzen estimator (BO-TPE) for an optimised deep neural network (DNN) model. The plant utilizes an upflow anaerobic sludge blanket (UASB) digester that treats industrial wastewater from soft drinks and breweries. The digester has a working volume of 1574 m³ and a total volume of 1914 m³; its internal diameter and height are 19 m and 7.14 m, respectively. Data preprocessing was conducted with meticulous attention to preserving data quality while avoiding data reduction. Three normalization techniques (MinMaxScaler, RobustScaler, and StandardScaler) were applied to the pre-processed data and compared with the non-normalized data. The RobustScaler approach gave strong predictive ability for estimating the volume of biogas produced: the highest predicted biogas volume was 2236.105 Nm³/d, with coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE) values of 0.712, 164.610, and 223.429, respectively.
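
A sketch of TPE-based hyperparameter search for a regression network, using Optuna's TPESampler as a stand-in for the paper's hybrid BO-TPE; the features (e.g., digester operating variables) and targets are synthetic placeholders.

```python
# TPE hyperparameter search for a DNN regressor (sketch with synthetic data).
import numpy as np
import optuna
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import RobustScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = X @ rng.normal(size=8) + rng.normal(0, 0.5, 400)  # placeholder biogas volume
X = RobustScaler().fit_transform(X)                   # best scaler in the study
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)

def objective(trial):
    layers = trial.suggest_int("n_layers", 1, 4)
    units = trial.suggest_int("units", 16, 128, log=True)
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    model = MLPRegressor(hidden_layer_sizes=(units,) * layers,
                         learning_rate_init=lr, max_iter=800, random_state=0)
    model.fit(X_tr, y_tr)
    return mean_absolute_error(y_va, model.predict(X_va))

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=25)
print(study.best_params, round(study.best_value, 3))
```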

Keywords: anaerobic digestion, biogas production, deep neural network, hybrid bo-tpe, hyperparameters tuning

Procedia PDF Downloads 8
33 Socio-Demographic Factors and Testing Practices Are Associated with Spatial Patterns of Clostridium difficile Infection in the Australian Capital Territory, 2004-2014

Authors: Aparna Lal, Ashwin Swaminathan, Teisa Holani

Abstract:

Background: Clostridium difficile infections (CDIs) have been on the rise globally; in Australia, rates of CDI in all states and territories have increased significantly since mid-2011. Identifying risk factors for CDI in the community can help inform targeted interventions to reduce infection. Methods: We examine the roles of neighbourhood socio-economic status, demography, testing practices, and the number of residential aged care facilities in spatial patterns of CDI incidence in the Australian Capital Territory. Data on all tests conducted for CDI were obtained from ACT Pathology by postcode for the period 1 January 2004 through 31 December 2014. The distribution of age groups and the neighbourhood Index of Relative Socio-economic Advantage and Disadvantage (IRSAD) were obtained from the Australian Bureau of Statistics 2011 National Census data. A Bayesian spatial conditional autoregressive model was fitted at the postcode level to quantify the relationship between CDI and socio-demographic factors. To identify CDI hotspots, exceedance probabilities were computed at a threshold of twice the estimated relative risk. Results: CDI showed a positive spatial association with the number of tests (RR=1.01, 95% CI 1.00, 1.02) and the resident population over 65 years (RR=1.00, 95% CI 1.00, 1.01). The standardized IRSAD was significantly negatively associated with CDI (RR=0.74, 95% CI 0.56, 0.94). We identified three postcodes with a high probability (0.8-1.0) of excess risk. Conclusions: We demonstrate geographic variation in CDI in the ACT, with a positive association between CDI and socio-economic disadvantage, and identify areas with a high probability of elevated risk compared with surrounding communities. These findings highlight community-based risk factors for CDI.

Keywords: spatial, socio-demographic, infection, Clostridium difficile

Procedia PDF Downloads 292
32 Phylogenetic Analysis Based On the Internal Transcribed Spacer-2 (ITS2) Sequences of Diadegma semiclausum (Hymenoptera: Ichneumonidae) Populations Reveals Significant Adaptive Evolution

Authors: Ebraheem Al-Jouri, Youssef Abu-Ahmad, Ramasamy Srinivasan

Abstract:

The parasitoid Diadegma semiclausum (Hymenoptera: Ichneumonidae) is one of the most effective exotic parasitoids of the diamondback moth (DBM), Plutella xylostella, in the lowland areas of Homs, Syria. Molecular evolution studies are useful tools to shed light on the molecular bases of insect geographical spread and adaptation to new hosts and environments, and for designing better control strategies. In this study, molecular evolution analysis was performed on 42 nuclear internal transcribed spacer-2 (ITS2) sequences representing D. semiclausum and eight other Diadegma spp. from Syria and worldwide. Possible recombination events were identified with the RDP4 program, and four potential recombinants of the American D. insulare and D. fenestrale (Jeju) were detected. After detecting and removing recombinant sequences, the ratio of non-synonymous (dN) to synonymous (dS) substitutions per site (dN/dS = ω) was used to identify codon positions involved in adaptive processes. Bayesian techniques were applied to detect selective pressures at the codon level using five different approaches: fixed effects likelihood (FEL), internal fixed effects likelihood (IFEL), random effects likelihood (REL), the mixed effects model of evolution (MEME), and phylogenetic analysis by maximum likelihood (PAML). Among the 40 positively selected amino acids (aa) that differed significantly between clades of Diadegma species, three aa under positive selection were identified only in D. semiclausum. Additionally, all D. semiclausum tree branches were found to be under episodic diversifying selection (EDS) at p≤0.05. Our study provides evidence that both recombination and positive selection have contributed to the molecular diversity of Diadegma spp. and highlights the significant contribution of D. semiclausum to adaptive evolution and fitness in the DBM parasitoid.

Keywords: diadegma sp, DBM, ITS2, phylogeny, recombination, dN/dS, evolution, positive selection

Procedia PDF Downloads 390
31 Introducing Two Species of Parastagonospora (Phaeosphaeriaceae) on Grasses from Italy and Russia, Based on Morphology and Phylogeny

Authors: Ishani D. Goonasekara, Erio Camporesi, Timur Bulgakov, Rungtiwa Phookamsak, Kevin D. Hyde

Abstract:

Phaeosphaeriaceae comprises a large number of species occurring mainly on grasses and cereal crops as endophytes, saprobes, and especially pathogens. Parastagonospora is an important genus in Phaeosphaeriaceae that includes pathogens causing leaf and glume blotch on cereal crops. Currently, fifteen Parastagonospora species have been described, including both pathogens and saprobes. In this study, one sexual morph species and one asexual morph species, occurring as saprobes on members of Poaceae, are introduced based on morphology and a combined molecular analysis of LSU, SSU, ITS, and RPB2 gene sequence data. The sexual morph species, Parastagonospora elymi, was isolated from a Russian sample of Elymus repens, a grass commonly known as couch grass, which is important for grazing animals, significant as a weed, and used in traditional Austrian medicine. P. elymi is similar to the sexual morph of P. avenae in having cylindrical asci bearing 8 overlapping, biseriate, fusiform ascospores, but can be distinguished by its wider, subglobose to conical ascomata. In addition, no sheath was observed surrounding the ascospores. The asexual morph species was isolated from an Italian specimen of Dactylis glomerata, a common grass distributed in temperate regions. It is introduced as Parastagonospora macrouniseptata, a coelomycete, and bears a close resemblance to P. allouniseptata and P. uniseptata in having globose to subglobose pycnidial conidiomata and hyaline, cylindrical, 1-septate conidia; however, the new species can be distinguished by its much larger conidiomata. In the phylogenetic analysis, which consisted of maximum likelihood and Bayesian analyses, P. elymi showed low bootstrap support but segregated well from the other strains within the Parastagonospora clade. P. neoallouniseptata formed a sister clade with P. allouniseptata with high statistical support.

Keywords: dothideomycetes, multi-gene analysis, Poaceae, saprobes, taxonomy

Procedia PDF Downloads 91
30 Use of SUDOKU Design to Assess the Implications of the Block Size and Testing Order on Efficiency and Precision of Dulce De Leche Preference Estimation

Authors: Jéssica Ferreira Rodrigues, Júlio Silvio De Sousa Bueno Filho, Vanessa Rios De Souza, Ana Carla Marques Pinheiro

Abstract:

This study aimed to evaluate the implications of block size and testing order for the efficiency and precision of preference estimation for dulce de leche samples. Efficiency was defined as the inverse of the average variance of pairwise comparisons among treatments; precision was defined as the inverse of the variance of the estimates of treatment means (or effects). The experiment was designed to test 16 treatments as a series of 8 Sudoku 16x16 designs, 4 randomized independently and 4 others in the reverse order, to yield balance in testing order. Linear mixed models were fitted to the whole experiment, with 112 testers and all their grades, as well as to partially balanced subgroups, namely: a) the experiment with the four initial EU; b) the experiment with EU 5 to 8; c) the experiment with EU 9 to 12; and d) the experiment with EU 13 to 16. Responses were recorded on a nine-point hedonic scale, and a mixed linear model was assumed, with random tester and treatment effects and a fixed test order effect. Analysis with a cumulative random effects probit link model was very similar, with essentially no different conclusions, so for simplicity we present the results under the Gaussian assumption. The R (CRAN) library lme4 and its function lmer (fit linear mixed-effects models) were used for the mixed models, and the libraries Bayesthresh (default Gaussian threshold function) and ordinal, with its function clmm (cumulative link mixed model), were used to check the Bayesian analysis of threshold models and the cumulative link probit models. It was noted that the number of samples tested in the same session can influence the acceptance level, underestimating acceptance; however, providing a large number of samples can help to improve sample discrimination.

Keywords: acceptance, block size, mixed linear model, testing order

Procedia PDF Downloads 294
29 Exploring the Applications of Neural Networks in the Adaptive Learning Environment

Authors: Baladitya Swaika, Rahul Khatry

Abstract:

Computer adaptive tests (CATs) are one of the most efficient ways of testing the cognitive abilities of students. CATs are based on item response theory (IRT), which performs item selection and ability estimation using the statistical methods of maximum information selection (or selection from the posterior) and maximum likelihood (ML) or maximum a posteriori (MAP) estimation, respectively. This study aims at combining classical and Bayesian approaches to IRT to create a dataset, which is then fed to a neural network that automates the process of ability estimation, and at comparing this to traditional CAT models designed using IRT. The study uses Python as the base coding language, pymc for statistical modelling of the IRT, and scikit-learn for the neural network implementation. On creating the model and comparing, it is found that the neural network-based model performs 7-10% worse than the IRT model for score estimation. Although it performs worse than the IRT model, the neural network model can be beneficially used in back-ends to reduce time complexity: the IRT model has to recalculate the ability every time it receives a request, whereas an already trained regressor can produce a prediction in a single step. This study also proposes a new kind of framework whereby the neural network model incorporates feature sets beyond the normal IRT feature set, using a neural network's capacity to learn unknown functions to give rise to better CAT models. Categorical features like test type could be learnt and incorporated into IRT functions with the help of techniques like logistic regression, producing models that may not be trivial to express via equations. Such a framework, when implemented, would be highly advantageous in psychometrics and cognitive assessment. This study gives a brief overview of how neural networks can be used in adaptive testing, not only by reducing time complexity but also by incorporating newer and better datasets, which would eventually lead to higher-quality testing.
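
A sketch of the comparison described above: MAP ability estimation under a 2PL IRT model, and a scikit-learn regressor trained to mimic it so that deployment needs one forward pass instead of re-optimisation per request. Item parameters and responses are simulated placeholders.

```python
# MAP ability estimation (2PL IRT) and a one-step neural surrogate (sketch).
import numpy as np
from scipy.optimize import minimize_scalar
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_items, n_students = 30, 500
a = rng.uniform(0.5, 2.0, n_items)            # discrimination
b = rng.normal(0, 1, n_items)                 # difficulty
theta_true = rng.normal(0, 1, n_students)
p = 1 / (1 + np.exp(-a * (theta_true[:, None] - b)))
responses = (rng.random((n_students, n_items)) < p).astype(float)

def map_theta(resp):
    def neg_log_post(t):
        pr = 1 / (1 + np.exp(-a * (t - b)))
        ll = np.sum(resp * np.log(pr) + (1 - resp) * np.log(1 - pr))
        return -(ll - 0.5 * t**2)             # standard normal prior on ability
    return minimize_scalar(neg_log_post, bounds=(-4, 4), method="bounded").x

theta_map = np.array([map_theta(r) for r in responses])

# one-step surrogate: response pattern in, ability estimate out
net = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
net.fit(responses, theta_map)
print("corr(NN, MAP):", np.corrcoef(net.predict(responses), theta_map)[0, 1].round(3))
```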

Keywords: computer adaptive tests, item response theory, machine learning, neural networks

Procedia PDF Downloads 153
28 Modelling Volatility Spillovers and Cross Hedging among Major Agricultural Commodity Futures

Authors: Roengchai Tansuchat, Woraphon Yamaka, Paravee Maneejuk

Abstract:

In the recent past, the global financial crisis, economic instability, and large fluctuations in agricultural commodity prices have led to increased concern about volatility transmission among commodities. The problem is exacerbated when a commodity's volatility is driven by other commodities' price fluctuations, which can make hedging decisions both costly and ineffective. This paper therefore analyses the volatility spillover effect among major agricultural commodities, including corn, soybeans, wheat, and rice, to help commodity suppliers hedge their portfolios and manage their risk and co-volatility. We provide a switching-regime approach to analyzing volatility spillovers in different economic conditions, namely economic upturns and downturns; in particular, we investigate the relationships and volatility transmissions between these commodities under these conditions. We propose a copula-based multivariate Markov-switching GARCH model with two regimes that depend on economic conditions and perform a simulation study to check the accuracy of the proposed model. In this study, the correlation term in the cross-hedge ratio is obtained from six copula families: two elliptical copulas (Gaussian and Student-t) and four Archimedean copulas (Clayton, Gumbel, Frank, and Joe). We use one-step maximum likelihood estimation to estimate the models and compare the performance of these copulas using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). In the application to agricultural commodities, the weekly data run from 4 January 2005 to 1 September 2016, covering 612 observations. The empirical results indicate that the volatility spillover effects among cereal futures differ in response to different economic conditions. In addition, the hedge effectiveness results suggest optimal cross-hedge strategies under different economic conditions, especially economic upturns and downturns.

Keywords: agricultural commodity futures, cereal, cross-hedge, spillover effect, switching regime approach

Procedia PDF Downloads 169
27 Machine Learning in Agriculture: A Brief Review

Authors: Aishi Kundu, Elhan Raza

Abstract:

"Necessity is the mother of invention" - Rapid increase in the global human population has directed the agricultural domain toward machine learning. The basic need of human beings is considered to be food which can be satisfied through farming. Farming is one of the major revenue generators for the Indian economy. Agriculture is not only considered a source of employment but also fulfils humans’ basic needs. So, agriculture is considered to be the source of employment and a pillar of the economy in developing countries like India. This paper provides a brief review of the progress made in implementing Machine Learning in the agricultural sector. Accurate predictions are necessary at the right time to boost production and to aid the timely and systematic distribution of agricultural commodities to make their availability in the market faster and more effective. This paper includes a thorough analysis of various machine learning algorithms applied in different aspects of agriculture (crop management, soil management, water management, yield tracking, livestock management, etc.).Due to climate changes, crop production is affected. Machine learning can analyse the changing patterns and come up with a suitable approach to minimize loss and maximize yield. Machine Learning algorithms/ models (regression, support vector machines, bayesian models, artificial neural networks, decision trees, etc.) are used in smart agriculture to analyze and predict specific outcomes which can be vital in increasing the productivity of the Agricultural Food Industry. It is to demonstrate vividly agricultural works under machine learning to sensor data. Machine Learning is the ongoing technology benefitting farmers to improve gains in agriculture and minimize losses. This paper discusses how the irrigation and farming management systems evolve in real-time efficiently. Artificial Intelligence (AI) enabled programs to emerge with rich apprehension for the support of farmers with an immense examination of data.

Keywords: machine learning, artificial intelligence, crop management, precision farming, smart farming, pre-harvesting, harvesting, post-harvesting

Procedia PDF Downloads 76