Search results for: Naïve Bayesian
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 453

Search results for: Naïve Bayesian

93 An Integrated Approach for Risk Management of Transportation of HAZMAT: Use of Quality Function Deployment and Risk Assessment

Authors: Guldana Zhigerbayeva, Ming Yang

Abstract:

Transportation of hazardous materials (HAZMAT) is inevitable in the process industries. The statistics show a significant number of accidents has occurred during the transportation of HAZMAT. This makes risk management of HAZMAT transportation an important topic. The tree-based methods including fault-trees, event-trees and cause-consequence analysis, and Bayesian network, have been applied to risk management of HAZMAT transportation. However, there is limited work on the development of a systematic approach. The existing approaches fail to build up the linkages between the regulatory requirements and the safety measures development. The analysis of historical data from the past accidents’ report databases would limit our focus on the specific incidents and their specific causes. Thus, we may overlook some essential elements in risk management, including regulatory compliance, field expert opinions, and suggestions. A systematic approach is needed to translate the regulatory requirements of HAZMAT transportation into specified safety measures (both technical and administrative) to support the risk management process. This study aims to first adapt the House of Quality (HoQ) to House of Safety (HoS) and proposes a new approach- Safety Function Deployment (SFD). The results of SFD will be used in a multi-criteria decision-support system to develop find an optimal route for HazMats transportation. The proposed approach will be demonstrated through a hypothetical transportation case in Kazakhstan.

Keywords: hazardous materials, risk assessment, risk management, quality function deployment

Procedia PDF Downloads 141
92 A Replicon-Baculovirus Model for Efficient Packaging of Hepatitis E Virus RNA and Production of Infectious Virions

Authors: Mohammad K. Parvez, Mohammed S. Al-Dosari

Abstract:

Hepatitis E virus (HEV) is an emerging RNA virus that causes acute and chronic liver disease with a global mortality rate of about 2%. Despite milestone developments in understanding of HEV biology, there is still lack of a robust culture system or animal model. Therefore, in a novel approach, two recombinant-baculoviruses (vBac-ORF2 and vBac-ORF3) that could overexpress HEV ORF2 (structural/capsid) and ORF3 (nonstructural/regulatory) proteins, respectively were constructed. The established HEV-SAR55 (genotype 1) replicon that contained GFP gene, in place of ORF2/ORF3 sequences was in vitro transcribed, and GFP production in RNA transfected S10-3 cells was scored by FACS. Enhanced infectivity, if any, of nascent virions produced by exogenously-supplied ORF2 and viral RNA by co-expression of ORF3 was tested on naïve HepG2 cells. Co-transduction with vBac-ORF2/vBac-ORF3 (108 pfu/microL) produced high amounts of native ORF2/ORF3 in approximately 60% of S10-3 cells, determined by immunofluorescence microscopy and Western analysis. FACS analysis showed about 9% GFP positivity of S10-3 cells on day6 post-transfection (i.e, day5 post-transduction). Further, FACS scoring indicated that lysates from S10-3 cultures receiving the RNA plus vBac-ORF2 were capable of producing HEV particles with about 4% infectivity in HepG2 cells. However, lysates of cultures co-transduced with vBac-ORF3, were found to further enhance virion infectivity by approximately 17%. This supported a previously proposed role of ORF3 as a minor-structural protein in HEV virion assembly and infectivity. In conclusion, the present model for efficient genomic RNA packaging and production of infectious virions could be a valuable tool to study various aspects of HEV molecular biology, in vitro.

Keywords: chronic liver disease, hepatitis E virus, ORF2, ORF3, replicon

Procedia PDF Downloads 255
91 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients

Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori

Abstract:

Introduction: Data mining defined as a process to find patterns and relationships along data in the database to build predictive models. Application of data mining extended in vast sectors such as the healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. This method applies various techniques and algorithms which have different accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques for the diagnosis of asthma based on patient symptoms and history. Method: Data mining includes several steps and decisions should be made by the user which starts by creation of an understanding of the scope and application of previous knowledge in this area and identifying KD process from the point of view of the stakeholders and finished by acting on discovered knowledge using knowledge conducting, integrating knowledge with other systems and knowledge documenting and reporting.in this study a stepwise methodology followed to achieve a logical outcome. Results: Sensitivity, Specifity and Accuracy of KNN, SVM, Naïve bayes, NN, Classification tree and CN2 algorithms and related similar studies was evaluated and ROC curves were plotted to show the performance of the system. Conclusion: The results show that we can accurately diagnose asthma, approximately ninety percent, based on the demographical and clinical data. The study also showed that the methods based on pattern discovery and data mining have a higher sensitivity compared to expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be base of diagnostics methods, therefore recommended to machine learning algorithms used in combination with knowledge-based algorithms.

Keywords: asthma, datamining, classification, machine learning

Procedia PDF Downloads 447
90 Ambivalence as Ethical Practice: Methodologies to Address Noise, Bias in Care, and Contact Evaluations

Authors: Anthony Townsend, Robyn Fasser

Abstract:

While complete objectivity is a desirable scientific position from which to conduct a care and contact evaluation (CCE), it is precisely the recognition that we are inherently incapable of operating objectively that is the foundation of ethical practice and skilled assessment. Drawing upon recent research from Daniel Kahneman (2021) on the differences between noise and bias, as well as different inherent biases collectively termed “The Elephant in the Brain” by Kevin Simler and Robin Hanson (2019) from Oxford University, this presentation addresses both the various ways in which our judgments, perceptions and even procedures can be distorted and contaminated while conducting a CCE, but also considers the value of second order cybernetics and the psychodynamic concept of ‘ambivalence’ as a conceptual basis to inform our assessment methodologies to limit such errors or at least better identify them. Both a conceptual framework for ambivalence, our higher-order capacity to allow for the convergence and consideration of multiple emotional experiences and cognitive perceptions to inform our reasoning, and a practical methodology for assessment relying on data triangulation, Bayesian inference and hypothesis testing is presented as a means of promoting ethical practice for health care professionals conducting CCEs. An emphasis on widening awareness and perspective, limiting ‘splitting’, is demonstrated both in how this form of emotional processing plays out in alienating dynamics in families as well as the assessment thereof. In addressing this concept, this presentation aims to illuminate the value of ambivalence as foundational to ethical practice for assessors.

Keywords: ambivalence, forensic, psychology, noise, bias, ethics

Procedia PDF Downloads 86
89 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records

Authors: Sara ElElimy, Samir Moustafa

Abstract:

Mobile network operators start to face many challenges in the digital era, especially with high demands from customers. Since mobile network operators are considered a source of big data, traditional techniques are not effective with new era of big data, Internet of things (IoT) and 5G; as a result, handling effectively different big datasets becomes a vital task for operators with the continuous growth of data and moving from long term evolution (LTE) to 5G. So, there is an urgent need for effective Big data analytics to predict future demands, traffic, and network performance to full fill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and recurrent neural network (RNN) are employed for a data-driven application to mobile network operators. The main framework included in models are identification parameters of each model, estimation, prediction, and final data-driven application of this prediction from business and network performance applications. These models are applied to Telecom Italia Big Data challenge call detail records (CDRs) datasets. The performance of these models is found out using a specific well-known evaluation criteria shows that ARIMA (machine learning-based model) is more accurate as a predictive model in such a dataset than the RNN (deep learning model).

Keywords: big data analytics, machine learning, CDRs, 5G

Procedia PDF Downloads 139
88 Determining of the Performance of Data Mining Algorithm Determining the Influential Factors and Prediction of Ischemic Stroke: A Comparative Study in the Southeast of Iran

Authors: Y. Mehdipour, S. Ebrahimi, A. Jahanpour, F. Seyedzaei, B. Sabayan, A. Karimi, H. Amirifard

Abstract:

Ischemic stroke is one of the common reasons for disability and mortality. The fourth leading cause of death in the world and the third in some other sources. Only 1/3 of the patients with ischemic stroke fully recover, 1/3 of them end in permanent disability and 1/3 face death. Thus, the use of predictive models to predict stroke has a vital role in reducing the complications and costs related to this disease. Thus, the aim of this study was to specify the effective factors and predict ischemic stroke with the help of DM methods. The present study was a descriptive-analytic study. The population was 213 cases from among patients referring to Ali ibn Abi Talib (AS) Hospital in Zahedan. Data collection tool was a checklist with the validity and reliability confirmed. This study used DM algorithms of decision tree for modeling. Data analysis was performed using SPSS-19 and SPSS Modeler 14.2. The results of the comparison of algorithms showed that CHAID algorithm with 95.7% accuracy has the best performance. Moreover, based on the model created, factors such as anemia, diabetes mellitus, hyperlipidemia, transient ischemic attacks, coronary artery disease, and atherosclerosis are the most effective factors in stroke. Decision tree algorithms, especially CHAID algorithm, have acceptable precision and predictive ability to determine the factors affecting ischemic stroke. Thus, by creating predictive models through this algorithm, will play a significant role in decreasing the mortality and disability caused by ischemic stroke.

Keywords: data mining, ischemic stroke, decision tree, Bayesian network

Procedia PDF Downloads 174
87 Non-Linear Regression Modeling for Composite Distributions

Authors: Mostafa Aminzadeh, Min Deng

Abstract:

Modeling loss data is an important part of actuarial science. Actuaries use models to predict future losses and manage financial risk, which can be beneficial for marketing purposes. In the insurance industry, small claims happen frequently while large claims are rare. Traditional distributions such as Normal, Exponential, and inverse-Gaussian are not suitable for describing insurance data, which often show skewness and fat tails. Several authors have studied classical and Bayesian inference for parameters of composite distributions, such as Exponential-Pareto, Weibull-Pareto, and Inverse Gamma-Pareto. These models separate small to moderate losses from large losses using a threshold parameter. This research introduces a computational approach using a nonlinear regression model for loss data that relies on multiple predictors. Simulation studies were conducted to assess the accuracy of the proposed estimation method. The simulations confirmed that the proposed method provides precise estimates for regression parameters. It's important to note that this approach can be applied to datasets if goodness-of-fit tests confirm that the composite distribution under study fits the data well. To demonstrate the computations, a real data set from the insurance industry is analyzed. A Mathematica code uses the Fisher information algorithm as an iteration method to obtain the maximum likelihood estimation (MLE) of regression parameters.

Keywords: maximum likelihood estimation, fisher scoring method, non-linear regression models, composite distributions

Procedia PDF Downloads 33
86 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about learning process, it can hold information about advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such a feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end of unit general textual feedback using part of speech feature (PoS) with four machine learning algorithms: support vector machines, decision tree, random forest, and naive bays. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one subset is tailored to assessment practices (assessment related), and the other one is the non-assessment related data. Second task to use the same algorithms to build an optimal model for whole data set, and the new data subsets to automatically detect their sentiment. The significance of this paper is to compare the performance of the above four algorithms using part of speech feature to the performance of the same algorithms using n-grams feature. The paper follows Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which is understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The results of this paper experiments show that both models which used both features performed very well regarding first task. But regarding the second task, models that used part of speech feature has underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 142
85 Laboratory-Based Monitoring of Hepatitis B Virus Vaccination Status in North Central Nigeria

Authors: Nwadioha Samuel Iheanacho, Abah Paul, Odimayo Simidele Michael

Abstract:

Background: The World Health Assembly through the Global Health Sector Strategy on viral hepatitis calls for the elimination of viral hepatitis as a public health threat by 2030. All hands are on deck to actualize this goal through an effective and active vaccination and monitoring tool. Aim: To combine the Epidemiologic with Laboratory Hepatitis B Virus vaccination monitoring tools. Method: Laboratory results analysis of subjects recruited during the World Hepatitis week from July 2020 to July 2021 was done after obtaining their epidemiologic data on Hepatitis B virus risk factors, in the Medical Microbiology Laboratory of Benue State University Teaching Hospital, Nigeria. Result: A total of 500 subjects comprising males 60.0%(n=300/500) and females 40.0%(n=200/500) were recruited. A fifty-three percent majority was of the age range of 26 to 36 years. Serologic profiles were as follows, 15.0%(n=75/500) HBsAg; 7.0% (n=35/500) HBeAg; 8.0% (n=40/500) Anti-Hbe; 20.0% (n=100/500) Anti-HBc and 38.0% (n=190/500) Anti-HBs. Immune responses to vaccination were as follows, 47.0%(n=235/500) Immune naïve {no serologic marker + normal ALT}; 33%(n=165/500) Immunity by vaccination {Anti-HBs + normal ALT}; 5%(n=25/500) Immunity to previous infection {Anti-HBs, Anti-HBc, +/- Anti-HBe + normal ALT}; 8%(n=40/500) Carriers {HBsAg, Anti-HBc, Anti-HBe +normal ALT} and 7% (35/500) Anti-HBe serum- negative infections {HBsAg, HBeAg, Anti-HBc +elevated ALT}. Conclusion: The present 33.0% immunity by vaccination coverage in Central Nigeria was much lower than the 41.0% national peak in 2013, and a far cry from the global expectation of attainment of a Global Health Sector Strategy on the elimination of viral hepatitis as a public health threat by 2030. Therefore, more creative ideas and collective effort are needed to attain this goal of the World Health Assembly.

Keywords: Hepatitis B, vaccination status, laboratory tools, resource-limited settings

Procedia PDF Downloads 74
84 Price Effect Estimation of Tobacco on Low-wage Male Smokers: A Causal Mediation Analysis

Authors: Kawsar Ahmed, Hong Wang

Abstract:

The study's goal was to estimate the causal mediation impact of tobacco tax before and after price hikes among low-income male smokers, with a particular emphasis on the effect estimating pathways framework for continuous and dichotomous variables. From July to December 2021, a cross-sectional investigation of observational data (n=739) was collected from Bangladeshi low-wage smokers. The Quasi-Bayesian technique, binomial probit model, and sensitivity analysis using a simulation of the computational tools R mediation package had been used to estimate the effect. After a price rise for tobacco products, the average number of cigarettes or bidis sticks taken decreased from 6.7 to 4.56. Tobacco product rising prices have a direct effect on low-income people's decisions to quit or lessen their daily smoking habits of Average Causal Mediation Effect (ACME) [effect=2.31, 95 % confidence interval (C.I.) = (4.71-0.00), p<0.01], Average Direct Effect (ADE) [effect=8.6, 95 percent (C.I.) = (6.8-0.11), p<0.001], and overall significant effects (p<0.001). Tobacco smoking choice is described by the mediated proportion of income effect, which is 26.1% less of following price rise. The curve of ACME and ADE is based on observational figures of the coefficients of determination that asses the model of hypothesis as the substantial consequence after price rises in the sensitivity analysis. To reduce smoking product behaviors, price increases through taxation have a positive causal mediation with income that affects the decision to limit tobacco use and promote low-income men's healthcare policy.

Keywords: causal mediation analysis, directed acyclic graphs, tobacco price policy, sensitivity analysis, pathway estimation

Procedia PDF Downloads 112
83 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay and bandwidth are the prime issues of wireless sensor network (WSN). Energy usage optimization and efficient bandwidth utilization are important issues in WSN. Event triggered data aggregation facilitates such optimal tasks for event affected area in WSN. Reliable delivery of the critical information to sink node is also a major challenge of WSN. To tackle these issues, we propose an event driven dynamic clustering and data aggregation scheme for WSN that enhances the life time of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever the event is triggered, event triggered node selects the cluster head. (2) Cluster head gathers data from sensor nodes within the cluster. (3) Cluster head node identifies and classifies the events out of the collected data using Bayesian classifier. (4) Aggregation of data is done using statistical method. (5) Cluster head discovers the paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data is critical, cluster head sends the aggregated data over the multipath for reliable data communication. (7) Otherwise aggregated data is transmitted towards sink node over the single path which is having the more bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 450
82 Predictive Value of Hepatitis B Core-Related Antigen (HBcrAg) during Natural History of Hepatitis B Virus Infection

Authors: Yanhua Zhao, Yu Gou, Shu Feng, Dongdong Li, Chuanmin Tao

Abstract:

The natural history of HBV infection could experience immune tolerant (IT), immune clearance (IC), HBeAg-negative inactive/quienscent carrier (ENQ), and HBeAg-negative hepatitis (ENH). As current biomarkers for discriminating these four phases have some weaknesses, additional serological indicators are needed. Hepatits B core-related antigen (HBcrAg) encoded with precore/core gene contains denatured HBeAg, HBV core antigen (HBcAg) and a 22KDa precore protein (p22cr), which was demonstrated to have a close association with natural history of hepatitis B infection, but no specific cutoff values and diagnostic parameters to evaluate the diagnostic efficacy. This study aimed to clarify the distribution of HBcrAg levels and evaluate its diagnostic performance during the natural history of infection from a Western Chinese perspective. 294 samples collected from treatment-naïve chronic hepatitis B (CHB) patients in different phases (IT=64; IC=72; ENQ=100, and ENH=58). We detected the HBcrAg values and analyzed the relationship between HBcrAg and HBV DNA. HBsAg and other clinical parameters were quantitatively tested. HBcrAg levels of four phases were 9.30 log U/mL, 8.80 log U/mL, 3.00 log U/mL, and 5.10 logU/mL, respectively (p < 0.0001). Receiver operating characteristic curve analysis demonstrated that the area under curves (AUCs) of HBcrAg and quantitative HBsAg at cutoff values of 9.25 log U/mL and 4.355 log IU/mL for distinguishing IT from IC phases were 0.704 and 0.694, with sensitivity 76.39% and 59.72%, specificity 53.13% and 79.69%, respectively. AUCs of HBcrAg and quantitative HBsAg at cutoff values of 4.15 log U/mlmL and 2.395 log IU/mlmL for discriminating between ENQ and ENH phases were 0.931 and 0.653, with sensitivity 87.93% and 84%, specificity 91.38% and 39%, respectively. Therefore, HBcrAg levels varied significantly among four natural phases of HBV infection. It had higher predictive performance than quantitative HBsAg for distinguishing between ENQ-patients and ENH-patients and similar performance with HBsAg for the discrimination between IT and IC phases, which indicated that HBcrAg could be a potential serological marker for CHB.

Keywords: chronic hepatitis B, hepatitis B core-related antigen, hepatitis B surface antigens, hepatitis B virus

Procedia PDF Downloads 417
81 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death for most African countries. Ethiopia is one of the seriously affected countries in sub Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become chronic, but manageable disease. The study focused on a data mining technique to predict future living status of HIV/AIDS patients at the time of drug regimen change when the patients become toxic to the currently taking ART drug combination. The data is taken from University of Gondar Hospital ART program database. Hybrid methodology is followed to explore the application of data mining on ART program dataset. Data cleaning, handling missing values and data transformation were used for preprocessing the data. WEKA 3.7.9 data mining tools, classification algorithms, and expertise are utilized as means to address the research problem. By using four different classification algorithms, (i.e., J48 Classifier, PART rule induction, Naïve Bayes and Neural network) and by adjusting their parameters thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performances of the models were evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model to predict the status of HIV patients with drug regimen substitution is pruned J48 decision tree with a classification accuracy of 98.01%. This study extracts interesting attributes such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender so as to predict the status of drug regimen substitution. The outcome of this study can be used as an assistant tool for the clinician to help them make more appropriate drug regimen substitution. Future research directions are forwarded to come up with an applicable system in the area of the study.

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 142
80 The role of Financial Development and Institutional Quality in Promoting Sustainable Development through Tourism Management

Authors: Hashim Zameer

Abstract:

Effective tourism management plays a vital role in promoting sustainability and supporting ecosystems. A common principle that has been in practice over the years is “first pollute and then clean,” indicating countries need financial resources to promote sustainability. Financial development and the tourism management both seems very important to promoting sustainable development. However, without institutional support, it is very difficult to succeed. In this context, it seems prominently significant to explore how institutional quality, tourism development, and financial development could promote sustainable development. In the past, no research explored the role of tourism development in sustainable development. Moreover, the role of financial development, natural resources, and institutional quality in sustainable development is also ignored. In this regard, this paper aims to investigate the role of tourism development, natural resources, financial development, and institutional quality in sustainable development in China. The study used time-series data from 2000–2021 and employed the Bayesian linear regression model because it is suitable for small data sets. The robustness of the findings was checked using a quantile regression approach. The results reveal that an increase in tourism expenditures stimulates the economy, creates jobs, encourages cultural exchange, and supports sustainability initiatives. Moreover, financial development and institution quality have a positive effect on sustainable development. However, reliance on natural resources can result in negative economic, social, and environmental outcomes, highlighting the need for resource diversification and management to reinforce sustainable development. These results highlight the significance of financial development, strong institutions, sustainable tourism, and careful utilization of natural resources for long-term sustainability. The study holds vital insights for policy formulation to promote sustainable tourism.

Keywords: sustainability, tourism development, financial development, institutional quality

Procedia PDF Downloads 82
79 Effects of Lung Protection Ventilation Strategies on Postoperative Pulmonary Complications After Noncardiac Surgery: A Network Meta-Analysis of Randomized Controlled Trials

Authors: Ran An, Dang Wang

Abstract:

Background: Mechanical ventilation has been confirmed to increase the incidence of postoperative pulmonary complications (PPCs), and several studies have shown that low tidal volumes combined with positive end-expiratory pressure (PEEP) and recruitment manoeuvres (RM) reduce the incidence of PPCs. However, the optimal lung-protective ventilatory strategy remains unclear. Methods: Multiple databases were searched for randomized controlled trials (RCTs) published prior to October 2023. The association between individual PEEP (iPEEP) or other forms of lung-protective ventilation and the incidence of PPCs was evaluated by Bayesian network meta-analysis. Results: We included 58 studies (11610 patients) in this meta-analysis. The network meta-analysis showed that low ventilation (LVt) combined with iPEEP and RM was associated with significantly lower incidences of PPCs [HVt: OR=0.38 95CrI (0.19, 0.75), LVt: OR=0.33, 95% CrI (0.12, 0.82)], postoperative atelectasis, and pneumonia than was HVt or LVt. In abdominal surgery, LVT combined with iPEEP or medium-to-high PEEP and RM were associated with significantly lower incidences of PPCs, postoperative atelectasis, and pneumonia. LVt combined with iPEEP and RM was ranked the highest, which was based on SUCRA scores. Conclusion: LVt combined with iPEEP and RM decreased the incidences of PPCs, postoperative atelectasis, and pneumonia in noncardiac surgery patients. iPEEP-guided ventilation was the optimal lung protection ventilation strategy. The quality of evidence was moderate.

Keywords: protection ventilation strategies, postoperative pulmonary complications, network meta-analysis, noncardiac surgery

Procedia PDF Downloads 35
78 Advancements in Predicting Diabetes Biomarkers: A Machine Learning Epigenetic Approach

Authors: James Ladzekpo

Abstract:

Background: The urgent need to identify new pharmacological targets for diabetes treatment and prevention has been amplified by the disease's extensive impact on individuals and healthcare systems. A deeper insight into the biological underpinnings of diabetes is crucial for the creation of therapeutic strategies aimed at these biological processes. Current predictive models based on genetic variations fall short of accurately forecasting diabetes. Objectives: Our study aims to pinpoint key epigenetic factors that predispose individuals to diabetes. These factors will inform the development of an advanced predictive model that estimates diabetes risk from genetic profiles, utilizing state-of-the-art statistical and data mining methods. Methodology: We have implemented a recursive feature elimination with cross-validation using the support vector machine (SVM) approach for refined feature selection. Building on this, we developed six machine learning models, including logistic regression, k-Nearest Neighbors (k-NN), Naive Bayes, Random Forest, Gradient Boosting, and Multilayer Perceptron Neural Network, to evaluate their performance. Findings: The Gradient Boosting Classifier excelled, achieving a median recall of 92.17% and outstanding metrics such as area under the receiver operating characteristics curve (AUC) with a median of 68%, alongside median accuracy and precision scores of 76%. Through our machine learning analysis, we identified 31 genes significantly associated with diabetes traits, highlighting their potential as biomarkers and targets for diabetes management strategies. Conclusion: Particularly noteworthy were the Gradient Boosting Classifier and Multilayer Perceptron Neural Network, which demonstrated potential in diabetes outcome prediction. We recommend future investigations to incorporate larger cohorts and a wider array of predictive variables to enhance the models' predictive capabilities.

Keywords: diabetes, machine learning, prediction, biomarkers

Procedia PDF Downloads 55
77 Bioinformatic Approaches in Population Genetics and Phylogenetic Studies

Authors: Masoud Sheidai

Abstract:

Biologists with a special field of population genetics and phylogeny have different research tasks such as populations’ genetic variability and divergence, species relatedness, the evolution of genetic and morphological characters, and identification of DNA SNPs with adaptive potential. To tackle these problems and reach a concise conclusion, they must use the proper and efficient statistical and bioinformatic methods as well as suitable genetic and morphological characteristics. In recent years application of different bioinformatic and statistical methods, which are based on various well-documented assumptions, are the proper analytical tools in the hands of researchers. The species delineation is usually carried out with the use of different clustering methods like K-means clustering based on proper distance measures according to the studied features of organisms. A well-defined species are assumed to be separated from the other taxa by molecular barcodes. The species relationships are studied by using molecular markers, which are analyzed by different analytical methods like multidimensional scaling (MDS) and principal coordinate analysis (PCoA). The species population structuring and genetic divergence are usually investigated by PCoA and PCA methods and a network diagram. These are based on bootstrapping of data. The Association of different genes and DNA sequences to ecological and geographical variables is determined by LFMM (Latent factor mixed model) and redundancy analysis (RDA), which are based on Bayesian and distance methods. Molecular and morphological differentiating characters in the studied species may be identified by linear discriminant analysis (DA) and discriminant analysis of principal components (DAPC). We shall illustrate these methods and related conclusions by giving examples from different edible and medicinal plant species.

Keywords: GWAS analysis, K-Means clustering, LFMM, multidimensional scaling, redundancy analysis

Procedia PDF Downloads 124
76 New Wine in an Old Bottle? Zhong-Yong Thinking and Creativity

Authors: Li-Fang CHou, Chun-Jung Tseng, Sung-Chun Tsai

Abstract:

Zhong-Yong represents unique values and cognitive beliefs of Chinese culture. Zhong-Yong thinking emphasizes (a) holistic thinking and perspective taking, (b) tolerance of contradictions, and (c) pursuance of a person’s interpersonal and inner harmony. With a unique way of naïve dialectical thinking based on Chinese culture, previous studies have found that people with higher Zhong-Yong thinking have more cognitive resources and resilience to make decision for dilemmas and cope stresses. Creativity is defined as the behavior to create novel and value products and viewed as the most important capital for individuals and enterprises. However, the relationship between Zhong-Yong thinking and creativity is still remaining to be unexplored. Three studies were conducted to explore the effects of Zhong-Yong thinking on creativity. In Study1, with 87 undergraduate students from a university in southern Taiwan as participants, we used questionnaire to measure Zhong-Yong thinking and processed creative task (unusual uses task) to get indicators of fluency and flexibility. After controlling background and openness to experience of Big five, the results showed that Zhong-Yong thinking had significant positive effects on fluency and flexibility. In Study 2, 97 undergraduate students were recruited to do Zhong-Yong thinking task and creative task. The result showed that, compared with control group, the participants had higher creative performance after being primed with Zhong-Yong thinking. In Study 3, we adopted questionnaire survey and took 397 employees from private enterprises in Taiwan as sample. Besides the main effects of Zhong-Yong thinking, the moderating effects on the relationship between leadership behavior and employee’s creative performance were also investigated. We found that (a) Zhong-Yong thinking was positively associated to creative performance; (b) Zhong-Yong thinking strengthened the positive effects of transformational and authoritative leadership on creative performance. Finally, the implications of theory/practice and limitations/future directions were also discussed.

Keywords: Zhong-Yong thinking, creativity and creative performance, unusual uses task, transformational leadership, authoritative leadership

Procedia PDF Downloads 582
75 Molecular Identification and Evolutionary Status of Lucilia bufonivora: An Obligate Parasite of Amphibians in Europe

Authors: Gerardo Arias, Richard Wall, Jamie Stevens

Abstract:

Lucilia bufonivora Moniez, is an obligate parasite of toads and frogs widely distributed in Europe. Its sister taxon Lucilia silvarum Meigen behaves mainly as a carrion breeder in Europe, however it has been reported as a facultative parasite of amphibians. These two closely related species are morphologically almost identical, which has led to misidentification, and in fact, it has been suggested that the amphibian myiasis cases by L. silvarum reported in Europe should be attributed to L. bufonivora. Both species remain poorly studied and their taxonomic relationships are still unclear. The identification of the larval specimens involved in amphibian myiasis with molecular tools and phylogenetic analysis of these two closely related species may resolve this problem. In this work seventeen unidentified larval specimens extracted from toad myiasis cases of the UK, the Netherlands and Switzerland were obtained, their COX1 (mtDNA) and EF1-α (Nuclear DNA) gene regions were amplified and then sequenced. The 17 larval samples were identified with both molecular markers as L. bufonivora. Phylogenetic analysis was carried out with 10 other blowfly species, including L. silvarum samples from the UK and USA. Bayesian Inference trees of COX1 and a combined-gene dataset suggested that L. silvarum and L. bufonivora are separate sister species. However, the nuclear gene EF1-α does not appear to resolve their relationships, suggesting that the rates of evolution of the mtDNA are much faster than those of the nuclear DNA. This work provides the molecular evidence for successful identification of L. bufonivora and a molecular analysis of the populations of this obligate parasite from different locations across Europe. The relationships with L. silvarum are discussed.

Keywords: calliphoridae, molecular evolution, myiasis, obligate parasitism

Procedia PDF Downloads 242
74 Principal Component Analysis Combined Machine Learning Techniques on Pharmaceutical Samples by Laser Induced Breakdown Spectroscopy

Authors: Kemal Efe Eseller, Göktuğ Yazici

Abstract:

Laser-induced breakdown spectroscopy (LIBS) is a rapid optical atomic emission spectroscopy which is used for material identification and analysis with the advantages of in-situ analysis, elimination of intensive sample preparation, and micro-destructive properties for the material to be tested. LIBS delivers short pulses of laser beams onto the material in order to create plasma by excitation of the material to a certain threshold. The plasma characteristics, which consist of wavelength value and intensity amplitude, depends on the material and the experiment’s environment. In the present work, medicine samples’ spectrum profiles were obtained via LIBS. Medicine samples’ datasets include two different concentrations for both paracetamol based medicines, namely Aferin and Parafon. The spectrum data of the samples were preprocessed via filling outliers based on quartiles, smoothing spectra to eliminate noise and normalizing both wavelength and intensity axis. Statistical information was obtained and principal component analysis (PCA) was incorporated to both the preprocessed and raw datasets. The machine learning models were set based on two different train-test splits, which were 70% training – 30% test and 80% training – 20% test. Cross-validation was preferred to protect the models against overfitting; thus the sample amount is small. The machine learning results of preprocessed and raw datasets were subjected to comparison for both splits. This is the first time that all supervised machine learning classification algorithms; consisting of Decision Trees, Discriminant, naïve Bayes, Support Vector Machines (SVM), k-NN(k-Nearest Neighbor) Ensemble Learning and Neural Network algorithms; were incorporated to LIBS data of paracetamol based pharmaceutical samples, and their different concentrations on preprocessed and raw dataset in order to observe the effect of preprocessing.

Keywords: machine learning, laser-induced breakdown spectroscopy, medicines, principal component analysis, preprocessing

Procedia PDF Downloads 87
73 An Ensemble System of Classifiers for Computer-Aided Volcano Monitoring

Authors: Flavio Cannavo

Abstract:

Continuous evaluation of the status of potentially hazardous volcanos plays a key role for civil protection purposes. The importance of monitoring volcanic activity, especially for energetic paroxysms that usually come with tephra emissions, is crucial not only for exposures to the local population but also for airline traffic. Presently, real-time surveillance of most volcanoes worldwide is essentially delegated to one or more human experts in volcanology, who interpret data coming from different kind of monitoring networks. Unfavorably, the high nonlinearity of the complex and coupled volcanic dynamics leads to a large variety of different volcanic behaviors. Moreover, continuously measured parameters (e.g. seismic, deformation, infrasonic and geochemical signals) are often not able to fully explain the ongoing phenomenon, thus making the fast volcano state assessment a very puzzling task for the personnel on duty at the control rooms. With the aim of aiding the personnel on duty in volcano surveillance, here we introduce a system based on an ensemble of data-driven classifiers to infer automatically the ongoing volcano status from all the available different kind of measurements. The system consists of a heterogeneous set of independent classifiers, each one built with its own data and algorithm. Each classifier gives an output about the volcanic status. The ensemble technique allows weighting the single classifier output to combine all the classifications into a single status that maximizes the performance. We tested the model on the Mt. Etna (Italy) case study by considering a long record of multivariate data from 2011 to 2015 and cross-validated it. Results indicate that the proposed model is effective and of great power for decision-making purposes.

Keywords: Bayesian networks, expert system, mount Etna, volcano monitoring

Procedia PDF Downloads 246
72 Reliability Qualification Test Plan Derivation Method for Weibull Distributed Products

Authors: Ping Jiang, Yunyan Xing, Dian Zhang, Bo Guo

Abstract:

The reliability qualification test (RQT) is widely used in product development to qualify whether the product meets predetermined reliability requirements, which are mainly described in terms of reliability indices, for example, MTBF (Mean Time Between Failures). It is widely exercised in product development. In engineering practices, RQT plans are mandatorily referred to standards, such as MIL-STD-781 or GJB899A-2009. But these conventional RQT plans in standards are not preferred, as the test plans often require long test times or have high risks for both producer and consumer due to the fact that the methods in the standards only use the test data of the product itself. And the standards usually assume that the product is exponentially distributed, which is not suitable for a complex product other than electronics. So it is desirable to develop an RQT plan derivation method that safely shortens test time while keeping the two risks under control. To meet this end, for the product whose lifetime follows Weibull distribution, an RQT plan derivation method is developed. The merit of the method is that expert judgment is taken into account. This is implemented by applying the Bayesian method, which translates the expert judgment into prior information on product reliability. Then producer’s risk and the consumer’s risk are calculated accordingly. The procedures to derive RQT plans are also proposed in this paper. As extra information and expert judgment are added to the derivation, the derived test plans have the potential to shorten the required test time and have satisfactory low risks for both producer and consumer, compared with conventional test plans. A case study is provided to prove that when using expert judgment in deriving product test plans, the proposed method is capable of finding ideal test plans that not only reduce the two risks but also shorten the required test time as well.

Keywords: expert judgment, reliability qualification test, test plan derivation, producer’s risk, consumer’s risk

Procedia PDF Downloads 137
71 Artificial Intelligence Techniques for Enhancing Supply Chain Resilience: A Systematic Literature Review, Holistic Framework, and Future Research

Authors: Adane Kassa Shikur

Abstract:

Today’s supply chains (SC) have become vulnerable to unexpected and ever-intensifying disruptions from myriad sources. Consequently, the concept of supply chain resilience (SCRes) has become crucial to complement the conventional risk management paradigm, which has failed to cope with unexpected SC disruptions, resulting in severe consequences affecting SC performances and making business continuity questionable. Advancements in cutting-edge technologies like artificial intelligence (AI) and their potential to enhance SCRes by improving critical antecedents in the different phases have attracted the attention of scholars and practitioners. The research from academia and the practical interest of the industry have yielded significant publications at the nexus of AI and SCRes during the last two decades. However, the applications and examinations have been primarily conducted independently, and the extant literature is dispersed into research streams despite the complex nature of SCRes. To close this research gap, this study conducts a systematic literature review of 106 peer-reviewed articles by curating, synthesizing, and consolidating up-to-date literature and presents the state-of-the-art development from 2010 to 2022. Bayesian networks are the most topical ones among the 13 AI techniques evaluated. Concerning the critical antecedents, visibility is the first ranking to be realized by the techniques. The study revealed that AI techniques support only the first 3 phases of SCRes (readiness, response, and recovery), and readiness is the most popular one, while no evidence has been found for the growth phase. The study proposed an AI-SCRes framework to inform research and practice to approach SCRes holistically. It also provided implications for practice, policy, and theory as well as gaps for impactful future research.

Keywords: ANNs, risk, Bauesian networks, vulnerability, resilience

Procedia PDF Downloads 95
70 Forecasting Lake Malawi Water Level Fluctuations Using Stochastic Models

Authors: M. Mulumpwa, W. W. L. Jere, M. Lazaro, A. H. N. Mtethiwa

Abstract:

The study considered Seasonal Autoregressive Integrated Moving Average (SARIMA) processes to select an appropriate stochastic model to forecast the monthly data from the Lake Malawi water levels for the period 1986 through 2015. The appropriate model was chosen based on SARIMA (p, d, q) (P, D, Q)S. The Autocorrelation function (ACF), Partial autocorrelation (PACF), Akaike Information Criteria (AIC), Bayesian Information Criterion (BIC), Box–Ljung statistics, correlogram and distribution of residual errors were estimated. The SARIMA (1, 1, 0) (1, 1, 1)12 was selected to forecast the monthly data of the Lake Malawi water levels from August, 2015 to December, 2021. The plotted time series showed that the Lake Malawi water levels are decreasing since 2010 to date but not as much as was the case in 1995 through 1997. The future forecast of the Lake Malawi water levels until 2021 showed a mean of 474.47 m ranging from 473.93 to 475.02 meters with a confidence interval of 80% and 90% against registered mean of 473.398 m in 1997 and 475.475 m in 1989 which was the lowest and highest water levels in the lake respectively since 1986. The forecast also showed that the water levels of Lake Malawi will drop by 0.57 meters as compared to the mean water levels recorded in the previous years. These results suggest that the Lake Malawi water level may not likely go lower than that recorded in 1997. Therefore, utilisation and management of water-related activities and programs among others on the lake should provide room for such scenarios. The findings suggest a need to manage the Lake Malawi jointly and prudently with other stakeholders starting from the catchment area. This will reduce impacts of anthropogenic activities on the lake’s water quality, water level, aquatic and adjacent terrestrial ecosystems thereby ensuring its resilience to climate change impacts.

Keywords: forecasting, Lake Malawi, water levels, water level fluctuation, climate change, anthropogenic activities

Procedia PDF Downloads 230
69 Constructing a Semi-Supervised Model for Network Intrusion Detection

Authors: Tigabu Dagne Akal

Abstract:

While advances in computer and communications technology have made the network ubiquitous, they have also rendered networked systems vulnerable to malicious attacks devised from a distance. These attacks or intrusions start with attackers infiltrating a network through a vulnerable host and then launching further attacks on the local network or Intranet. Nowadays, system administrators and network professionals can attempt to prevent such attacks by developing intrusion detection tools and systems using data mining technology. In this study, the experiments were conducted following the Knowledge Discovery in Database Process Model. The Knowledge Discovery in Database Process Model starts from selection of the datasets. The dataset used in this study has been taken from Massachusetts Institute of Technology Lincoln Laboratory. After taking the data, it has been pre-processed. The major pre-processing activities include fill in missed values, remove outliers; resolve inconsistencies, integration of data that contains both labelled and unlabelled datasets, dimensionality reduction, size reduction and data transformation activity like discretization tasks were done for this study. A total of 21,533 intrusion records are used for training the models. For validating the performance of the selected model a separate 3,397 records are used as a testing set. For building a predictive model for intrusion detection J48 decision tree and the Naïve Bayes algorithms have been tested as a classification approach for both with and without feature selection approaches. The model that was created using 10-fold cross validation using the J48 decision tree algorithm with the default parameter values showed the best classification accuracy. The model has a prediction accuracy of 96.11% on the training datasets and 93.2% on the test dataset to classify the new instances as normal, DOS, U2R, R2L and probe classes. The findings of this study have shown that the data mining methods generates interesting rules that are crucial for intrusion detection and prevention in the networking industry. Future research directions are forwarded to come up an applicable system in the area of the study.

Keywords: intrusion detection, data mining, computer science, data mining

Procedia PDF Downloads 296
68 Identification of New Familial Breast Cancer Susceptibility Genes: Are We There Yet?

Authors: Ian Campbell, Gillian Mitchell, Paul James, Na Li, Ella Thompson

Abstract:

The genetic cause of the majority of multiple-case breast cancer families remains unresolved. Next generation sequencing has emerged as an efficient strategy for identifying predisposing mutations in individuals with inherited cancer. We are conducting whole exome sequence analysis of germ line DNA from multiple affected relatives from breast cancer families, with the aim of identifying rare protein truncating and non-synonymous variants that are likely to include novel cancer predisposing mutations. Data from more than 200 exomes show that on average each individual carries 30-50 protein truncating mutations and 300-400 rare non-synonymous variants. Heterogeneity among our exome data strongly suggest that numerous moderate penetrance genes remain to be discovered, with each gene individually accounting for only a small fraction of families (~0.5%). This scenario marks validation of candidate breast cancer predisposing genes in large case-control studies as the rate-limiting step in resolving the missing heritability of breast cancer. The aim of this study is to screen genes that are recurrently mutated among our exome data in a larger cohort of cases and controls to assess the prevalence of inactivating mutations that may be associated with breast cancer risk. We are using the Agilent HaloPlex Target Enrichment System to screen the coding regions of 168 genes in 1,000 BRCA1/2 mutation-negative familial breast cancer cases and 1,000 cancer-naive controls. To date, our interim analysis has identified 21 genes which carry an excess of truncating mutations in multiple breast cancer families versus controls. Established breast cancer susceptibility gene PALB2 is the most frequently mutated gene (13/998 cases versus 0/1009 controls), but other interesting candidates include NPSR1, GSN, POLD2, and TOX3. These and other genes are being validated in a second cohort of 1,000 cases and controls. Our experience demonstrates that beyond PALB2, the prevalence of mutations in the remaining breast cancer predisposition genes is likely to be very low making definitive validation exceptionally challenging.

Keywords: predisposition, familial, exome sequencing, breast cancer

Procedia PDF Downloads 493
67 Self-Disclosure and Privacy Management Behavior in Social Media: Privacy Calculus Perspective

Authors: Chien-Wen Chen, Nguyen Duong Thuy Trang, Yu-Hsuan Chang

Abstract:

With the development of information technology, social networking sites are inseparable from life and have become an important way for people to communicate. Nonetheless, privacy issues are raised by the presence of personal information on social networking sites. However, users can benefit from using the functions of social networking sites, which also leads to users worrying about the leakage of personal information without corresponding privacy protection behaviors, which is called the privacy paradox. However, previous studies have questioned the viewpoint of the privacy paradox, believing that users are not so naive and that people with privacy concerns will conduct privacy management. Consequently, this study is based on the view of privacy calculation perspective to investigate the privacy behavior of users on social networking sites. Among them, social benefits and privacy concerns are taken as the expected benefits and costs in the viewpoint of privacy calculation. At the same time, this study also explores the antecedents, including positive feedback, self-presentation, privacy policy, and information sensitivity, and the consequence of privacy behavior of weighing benefits and costs, including self-disclosure and three privacy management strategies by interpersonal boundaries (Preventive, Censorship, and Corrective). The survey respondents' characteristics and prior use experience of social networking sites were analyzed. As a consequence, a survey of 596 social network users was conducted online to validate the research framework. The results show that social benefit has the greatest influence on privacy behavior. The most important external factors affecting privacy behavior are positive feedback, followed by the privacy policy and information sensitivity. In addition, the important findings of this study are that social benefits will positively affect privacy management. It shows that users can get satisfaction from interacting with others through social networking sites. They will not only disclose themselves but also manage their privacy on social networking sites after considering social benefits and privacy management on social networking sites, and it expands the adoption of the Privacy Calculus Perspective framework from prior research. Therefore, it is suggested that as the functions of social networking sites increase and the development of social networking sites, users' needs should be understood and updated in order to ensure the sustainable operation of social networking.

Keywords: privacy calculus perspective, self-disclosure, privacy management, social benefit, privacy concern

Procedia PDF Downloads 89
66 Improving 99mTc-tetrofosmin Myocardial Perfusion Images by Time Subtraction Technique

Authors: Yasuyuki Takahashi, Hayato Ishimura, Masao Miyagawa, Teruhito Mochizuki

Abstract:

Quantitative measurement of myocardium perfusion is possible with single photon emission computed tomography (SPECT) using a semiconductor detector. However, accumulation of 99mTc-tetrofosmin in the liver may make it difficult to assess that accurately in the inferior myocardium. Our idea is to reduce the high accumulation in the liver by using dynamic SPECT imaging and a technique called time subtraction. We evaluated the performance of a new SPECT system with a cadmium-zinc-telluride solid-state semi- conductor detector (Discovery NM 530c; GE Healthcare). Our system acquired list-mode raw data over 10 minutes for a typical patient. From the data, ten SPECT images were reconstructed, one for every minute of acquired data. Reconstruction with the semiconductor detector was based on an implementation of a 3-D iterative Bayesian reconstruction algorithm. We studied 20 patients with coronary artery disease (mean age 75.4 ± 12.1 years; range 42-86; 16 males and 4 females). In each subject, 259 MBq of 99mTc-tetrofosmin was injected intravenously. We performed both a phantom and a clinical study using dynamic SPECT. An approximation to a liver-only image is obtained by reconstructing an image from the early projections during which time the liver accumulation dominates (0.5~2.5 minutes SPECT image-5~10 minutes SPECT image). The extracted liver-only image is then subtracted from a later SPECT image that shows both the liver and the myocardial uptake (5~10 minutes SPECT image-liver-only image). The time subtraction of liver was possible in both a phantom and the clinical study. The visualization of the inferior myocardium was improved. In past reports, higher accumulation in the myocardium due to the overlap of the liver is un-diagnosable. Using our time subtraction method, the image quality of the 99mTc-tetorofosmin myocardial SPECT image is considerably improved.

Keywords: 99mTc-tetrofosmin, dynamic SPECT, time subtraction, semiconductor detector

Procedia PDF Downloads 335
65 Comparison of Different Artificial Intelligence-Based Protein Secondary Structure Prediction Methods

Authors: Jamerson Felipe Pereira Lima, Jeane Cecília Bezerra de Melo

Abstract:

The difficulty and cost related to obtaining of protein tertiary structure information through experimental methods, such as X-ray crystallography or NMR spectroscopy, helped raising the development of computational methods to do so. An approach used in these last is prediction of tridimensional structure based in the residue chain, however, this has been proved an NP-hard problem, due to the complexity of this process, explained by the Levinthal paradox. An alternative solution is the prediction of intermediary structures, such as the secondary structure of the protein. Artificial Intelligence methods, such as Bayesian statistics, artificial neural networks (ANN), support vector machines (SVM), among others, were used to predict protein secondary structure. Due to its good results, artificial neural networks have been used as a standard method to predict protein secondary structure. Recent published methods that use this technique, in general, achieved a Q3 accuracy between 75% and 83%, whereas the theoretical accuracy limit for protein prediction is 88%. Alternatively, to achieve better results, support vector machines prediction methods have been developed. The statistical evaluation of methods that use different AI techniques, such as ANNs and SVMs, for example, is not a trivial problem, since different training sets, validation techniques, as well as other variables can influence the behavior of a prediction method. In this study, we propose a prediction method based on artificial neural networks, which is then compared with a selected SVM method. The chosen SVM protein secondary structure prediction method is the one proposed by Huang in his work Extracting Physico chemical Features to Predict Protein Secondary Structure (2013). The developed ANN method has the same training and testing process that was used by Huang to validate his method, which comprises the use of the CB513 protein data set and three-fold cross-validation, so that the comparative analysis of the results can be made comparing directly the statistical results of each method.

Keywords: artificial neural networks, protein secondary structure, protein structure prediction, support vector machines

Procedia PDF Downloads 621
64 Developing a DNN Model for the Production of Biogas From a Hybrid BO-TPE System in an Anaerobic Wastewater Treatment Plant

Authors: Hadjer Sadoune, Liza Lamini, Scherazade Krim, Amel Djouadi, Rachida Rihani

Abstract:

Deep neural networks are highly regarded for their accuracy in predicting intricate fermentation processes. Their ability to learn from a large amount of datasets through artificial intelligence makes them particularly effective models. The primary obstacle in improving the performance of these models is to carefully choose the suitable hyperparameters, including the neural network architecture (number of hidden layers and hidden units), activation function, optimizer, learning rate, and other relevant factors. This study predicts biogas production from real wastewater treatment plant data using a sophisticated approach: hybrid Bayesian optimization with a tree-structured Parzen estimator (BO-TPE) for an optimised deep neural network (DNN) model. The plant utilizes an Upflow Anaerobic Sludge Blanket (UASB) digester that treats industrial wastewater from soft drinks and breweries. The digester has a working volume of 1574 m3 and a total volume of 1914 m3. Its internal diameter and height were 19 and 7.14 m, respectively. The data preprocessing was conducted with meticulous attention to preserving data quality while avoiding data reduction. Three normalization techniques were applied to the pre-processed data (MinMaxScaler, RobustScaler and StandardScaler) and compared with the Non-Normalized data. The RobustScaler approach has strong predictive ability for estimating the volume of biogas produced. The highest predicted biogas volume was 2236.105 Nm³/d, with coefficient of determination (R2), mean absolute error (MAE), and root mean square error (RMSE) values of 0.712, 164.610, and 223.429, respectively.

Keywords: anaerobic digestion, biogas production, deep neural network, hybrid bo-tpe, hyperparameters tuning

Procedia PDF Downloads 38