Search results for: KTH dataset
207 The Importance of including All Data in a Linear Model for the Analysis of RNAseq Data
Authors: Roxane A. Legaie, Kjiana E. Schwab, Caroline E. Gargett
Abstract:
Studies looking at the changes in gene expression from RNAseq data often make use of linear models. It is also common practice to focus on a subset of data for a comparison of interest, leaving aside the samples not involved in this particular comparison. This work shows the importance of including all observations in the modeling process to better estimate variance parameters, even when the samples included are not directly used in the comparison under test. The human endometrium is a dynamic tissue, which undergoes cycles of growth and regression with each menstrual cycle. The mesenchymal stem cells (MSCs) present in the endometrium are likely responsible for this remarkable regenerative capacity. However recent studies suggest that MSCs also plays a role in the pathogenesis of endometriosis, one of the most common medical conditions affecting the lower abdomen in women in which the endometrial tissue grows outside the womb. In this study we compared gene expression profiles between MSCs and non-stem cell counterparts (‘non-MSC’) obtained from women with (‘E’) or without (‘noE’) endometriosis from RNAseq. Raw read counts were used for differential expression analysis using a linear model with the limma-voom R package, including either all samples in the study or only the samples belonging to the subset of interest (e.g. for the comparison ‘E vs noE in MSC cells’, including only MSC samples from E and noE patients but not the non-MSC ones). Using the full dataset we identified about 100 differentially expressed (DE) genes between E and noE samples in MSC samples (adj.p-val < 0.05 and |logFC|>1) while only 9 DE genes were identified when using only the subset of data (MSC samples only). Important genes known to be involved in endometriosis such as KLF9 and RND3 were missed in the latter case. When looking at the MSC vs non-MSC cells comparison, the linear model including all samples identified 260 genes for noE samples (including the stem cell marker SUSD2) while the subset analysis did not identify any DE genes. When looking at E samples, 12 genes were identified with the first approach and only 1 with the subset approach. Although the stem cell marker RGS5 was found in both cases, the subset test missed important genes involved in stem cell differentiation such as NOTCH3 and other potentially related genes to be used for further investigation and pathway analysis.Keywords: differential expression, endometriosis, linear model, RNAseq
Procedia PDF Downloads 432206 Hybrid Method for Smart Suggestions in Conversations for Online Marketplaces
Authors: Yasamin Rahimi, Ali Kamandi, Abbas Hoseini, Hesam Haddad
Abstract:
Online/offline chat is a convenient approach in the electronic markets of second-hand products in which potential customers would like to have more information about the products to fill the information gap between buyers and sellers. Online peer in peer market is trying to create artificial intelligence-based systems that help customers ask more informative questions in an easier way. In this article, we introduce a method for the question/answer system that we have developed for the top-ranked electronic market in Iran called Divar. When it comes to secondhand products, incomplete product information in a purchase will result in loss to the buyer. One way to balance buyer and seller information of a product is to help the buyer ask more informative questions when purchasing. Also, the short time to start and achieve the desired result of the conversation was one of our main goals, which was achieved according to A/B tests results. In this paper, we propose and evaluate a method for suggesting questions and answers in the messaging platform of the e-commerce website Divar. Creating such systems is to help users gather knowledge about the product easier and faster, All from the Divar database. We collected a dataset of around 2 million messages in Persian colloquial language, and for each category of product, we gathered 500K messages, of which only 2K were Tagged, and semi-supervised methods were used. In order to publish the proposed model to production, it is required to be fast enough to process 10 million messages daily on CPU processors. In order to reach that speed, in many subtasks, faster and simplistic models are preferred over deep neural models. The proposed method, which requires only a small amount of labeled data, is currently used in Divar production on CPU processors, and 15% of buyers and seller’s messages in conversations is directly chosen from our model output, and more than 27% of buyers have used this model suggestions in at least one daily conversation.Keywords: smart reply, spell checker, information retrieval, intent detection, question answering
Procedia PDF Downloads 187205 Simulations to Predict Solar Energy Potential by ERA5 Application at North Africa
Authors: U. Ali Rahoma, Nabil Esawy, Fawzia Ibrahim Moursy, A. H. Hassan, Samy A. Khalil, Ashraf S. Khamees
Abstract:
The design of any solar energy conversion system requires the knowledge of solar radiation data obtained over a long period. Satellite data has been widely used to estimate solar energy where no ground observation of solar radiation is available, yet there are limitations on the temporal coverage of satellite data. Reanalysis is a “retrospective analysis” of the atmosphere parameters generated by assimilating observation data from various sources, including ground observation, satellites, ships, and aircraft observation with the output of NWP (Numerical Weather Prediction) models, to develop an exhaustive record of weather and climate parameters. The evaluation of the performance of reanalysis datasets (ERA-5) for North Africa against high-quality surface measured data was performed using statistical analysis. The estimation of global solar radiation (GSR) distribution over six different selected locations in North Africa during ten years from the period time 2011 to 2020. The root means square error (RMSE), mean bias error (MBE) and mean absolute error (MAE) of reanalysis data of solar radiation range from 0.079 to 0.222, 0.0145 to 0.198, and 0.055 to 0.178, respectively. The seasonal statistical analysis was performed to study seasonal variation of performance of datasets, which reveals the significant variation of errors in different seasons—the performance of the dataset changes by changing the temporal resolution of the data used for comparison. The monthly mean values of data show better performance, but the accuracy of data is compromised. The solar radiation data of ERA-5 is used for preliminary solar resource assessment and power estimation. The correlation coefficient (R2) varies from 0.93 to 99% for the different selected sites in North Africa in the present research. The goal of this research is to give a good representation for global solar radiation to help in solar energy application in all fields, and this can be done by using gridded data from European Centre for Medium-Range Weather Forecasts ECMWF and producing a new model to give a good result.Keywords: solar energy, solar radiation, ERA-5, potential energy
Procedia PDF Downloads 211204 An ANOVA-based Sequential Forward Channel Selection Framework for Brain-Computer Interface Application based on EEG Signals Driven by Motor Imagery
Authors: Forouzan Salehi Fergeni
Abstract:
Converting the movement intents of a person into commands for action employing brain signals like electroencephalogram signals is a brain-computer interface (BCI) system. When left or right-hand motions are imagined, different patterns of brain activity appear, which can be employed as BCI signals for control. To make better the brain-computer interface (BCI) structures, effective and accurate techniques for increasing the classifying precision of motor imagery (MI) based on electroencephalography (EEG) are greatly needed. Subject dependency and non-stationary are two features of EEG signals. So, EEG signals must be effectively processed before being used in BCI applications. In the present study, after applying an 8 to 30 band-pass filter, a car spatial filter is rendered for the purpose of denoising, and then, a method of analysis of variance is used to select more appropriate and informative channels from a category of a large number of different channels. After ordering channels based on their efficiencies, a sequential forward channel selection is employed to choose just a few reliable ones. Features from two domains of time and wavelet are extracted and shortlisted with the help of a statistical technique, namely the t-test. Finally, the selected features are classified with different machine learning and neural network classifiers being k-nearest neighbor, Probabilistic neural network, support-vector-machine, Extreme learning machine, decision tree, Multi-layer perceptron, and linear discriminant analysis with the purpose of comparing their performance in this application. Utilizing a ten-fold cross-validation approach, tests are performed on a motor imagery dataset found in the BCI competition III. Outcomes demonstrated that the SVM classifier got the greatest classification precision of 97% when compared to the other available approaches. The entire investigative findings confirm that the suggested framework is reliable and computationally effective for the construction of BCI systems and surpasses the existing methods.Keywords: brain-computer interface, channel selection, motor imagery, support-vector-machine
Procedia PDF Downloads 50203 Caregiver Training Results in Accurate Reporting of Stool Frequency
Authors: Matthew Heidman, Susan Dallabrida, Analice Costa
Abstract:
Background:Accuracy of caregiver reported outcomes is essential for infant growth and tolerability study success. Crying/fussiness, stool consistencies, and other gastrointestinal characteristics are important parameters regarding tolerability, and inter-caregiver reporting can see a significant amount of subjectivity and vary greatly within a study, compromising data. This study sought to elucidate how caregiver reported questions related to stool frequency are answered before and after a short amount of training and how training impacts caregivers’ understanding, and how they would answer the question. Methods:A digital survey was issued for 90 daysin the US (n=121) and 30 days in Mexico (n=88), targeting respondents with children ≤4 years of age. Respondents were asked a question in two formats, first without a line of training text and second with a line of training text. The question set was as follows, “If your baby had stool in his/her diaper and you changed the diaper and 10 min later there was more stool in the diaper, how many stools would you report this as?” followed by the same question beginning with “If you were given the instruction that IF there are at least 5 minutes in between stools, then it counts as two (2) stools…”.Four response items were provided for both questions, 1) 2 stools, 2) 1stool, 3) it depends on how much stool was in the first versus the second diaper, 4) There is not enough information to be able to answer the question. Response frequencies between questions were compared. Results: Responses to the question without training saw some variability in the US, with 69% selecting “2 stools”,11% selecting “1 stool”, 14% selecting “it depends on how much stool was in the first versus the second diaper”, and 7% selecting “There is not enough information to be able to answer the question” and in Mexico respondents selected 9%, 78%, 13%, and 0% respectively. However, responses to the question after training saw more consolidation in the US, with 85% of respondents selecting“2 stools,” representing an increase in those selecting the correct answer. Additionally in Mexico, with 84% of respondents selecting “1 episode” representing an increase in the those selecting the correct response. Conclusions: Caregiver reported outcomes are critical for infant growth and tolerability studies, however, they can be highly subjective and see a high variability of responses without guidance. Training is critical to standardize all caregivers’ perspective regarding how to answer questions accurately in order to provide an accurate dataset.Keywords: infant nutrition, clinical trial optimization, stool reporting, decentralized clinical trials
Procedia PDF Downloads 96202 Ownership and Shareholder Schemes Effects on Airport Corporate Strategy in Europe
Authors: Dimitrios Dimitriou, Maria Sartzetaki
Abstract:
In the early days of the of civil aviation, airports are totally state-owned companies under the control of national authorities or regional governmental bodies. From that time the picture has totally changed and airports privatisation and airport business commercialisation are key success factors to stimulate air transport demand, generate revenues and attract investors, linked to reliable and resilience of air transport system. Nowadays, airport's corporate strategy deals with policies and actions, affecting essential the business plans, the financial targets and the economic footprint in a regional economy they serving. Therefore, exploring airport corporate strategy is essential to support the decision in business planning, management efficiency, sustainable development and investment attractiveness on one hand; and define policies towards traffic development, revenues generation, capacity expansion, cost efficiency and corporate social responsibility. This paper explores key outputs in airport corporate strategy for different ownership schemes. The airport corporations are grouped in three major schemes: (a) Public, in which the public airport operator acts as part of the government administration or as a corporised public operator; (b) Mixed scheme, in which the majority of the shares and the corporate strategy is driven by the private or the public sector; and (c) Private, in which the airport strategy is driven by the key aspects of globalisation and liberalisation of the aviation sector. By a systemic approach, the key drivers in corporate strategy for modern airport business structures are defined. Key objectives are to define the key strategic opportunities and challenges and assess the corporate goals and risks towards sustainable business development for each scheme. The analysis based on an extensive cross-sectional dataset for a sample of busy European airports providing results on corporate strategy key priorities, risks and business models. The conventional wisdom is to highlight key messages to authorities, institutes and professionals on airport corporate strategy trends and directions.Keywords: airport corporate strategy, airport ownership, airports business models, corporate risks
Procedia PDF Downloads 304201 Predicting Expectations of Non-Monogamy in Long-Term Romantic Relationships
Authors: Michelle R. Sullivan
Abstract:
Positive romantic relationships and marriages offer a buffer against a host of physical and emotional difficulties. Conversely, poor relationship quality and marital discord can have deleterious consequences for individuals and families. Research has described non-monogamy, infidelity, and consensual non-monogamy, as both consequential and causal of relationship difficulty, or as a unique way a couple strives to make a relationship work. Much research on consensual non-monogamy has built on feminist theory and critique. To the author’s best knowledge, to date, no studies have examined the predictive relationship between individual and relationship characteristics and expectations of non-monogamy. The current longitudinal study: 1) estimated the prevalence of expectations of partner non-monogamy and 2) evaluated whether gender, sexual identity, age, education, how a couple met, and relationship quality were predictive expectations of partner non-monogamy. This study utilized the publically available longitudinal dataset, How Couples Meet and Stay Together. Adults aged 18- to 98-years old (n=4002) were surveyed by phone over 5 waves from 2009-2014. Demographics and how a couple met were gathered through self-report in Wave 1, and relationship quality and expectations of partner non-monogamy were gathered through self-report in Waves 4 and 5 (n=1047). The prevalence of expectations of partner non-monogamy (encompassing both infidelity and consensual non-monogamy) was 4.8%. Logistic regression models indicated that sexual identity, gender, education, and relationship quality were significantly predictive of expectations of partner non-monogamy. Specifically, male gender, lower education, identifying as lesbian, gay, or bisexual, and a lower relationship quality scores were predictive of expectations of partner non-monogamy. Male gender was not predictive of expectations of partner non-monogamy in the follow up logistic regression model. Age and whether a couple met online were not associated with expectations of partner non-monogamy. Clinical implications include awareness of the increased likelihood of lesbian, gay, and bisexual individuals to have an expectation of non-monogamy and the sequelae of relationship dissatisfaction that may be related. Future research directions could differentiate between non-monogamy subtypes and the person and relationship variables that lead to the likelihood of consensual non-monogamy and infidelity as separate constructs, as well as explore the relationship between predicting partner behavior and actual partner behavioral outcomes.Keywords: open relationship, polyamory, infidelity, relationship satisfaction
Procedia PDF Downloads 159200 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling
Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal
Abstract:
Datasets or collections are becoming important assets by themselves and now they can be accepted as a primary intellectual output of a research. The quality and usage of the datasets depend mainly on the context under which they have been collected, processed, analyzed, validated, and interpreted. This paper aims to present a collection of program educational objectives mapped to student’s outcomes collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time consuming process. In addition, it requires experts in the area, which are mostly not available. It has been shown the operational settings under which the collection has been produced. The collection has been cleansed, preprocessed, some features have been selected and preliminary exploratory data analysis has been performed so as to illustrate the properties and usefulness of the collection. At the end, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to other experimented multi-label classification methods. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining
Procedia PDF Downloads 173199 Groundwater Potential Delineation Using Geodetector Based Convolutional Neural Network in the Gunabay Watershed of Ethiopia
Authors: Asnakew Mulualem Tegegne, Tarun Kumar Lohani, Abunu Atlabachew Eshete
Abstract:
Groundwater potential delineation is essential for efficient water resource utilization and long-term development. The scarcity of potable and irrigation water has become a critical issue due to natural and anthropogenic activities in meeting the demands of human survival and productivity. With these constraints, groundwater resources are now being used extensively in Ethiopia. Therefore, an innovative convolutional neural network (CNN) is successfully applied in the Gunabay watershed to delineate groundwater potential based on the selected major influencing factors. Groundwater recharge, lithology, drainage density, lineament density, transmissivity, and geomorphology were selected as major influencing factors during the groundwater potential of the study area. For dataset training, 70% of samples were selected and 30% were used for serving out of the total 128 samples. The spatial distribution of groundwater potential has been classified into five groups: very low (10.72%), low (25.67%), moderate (31.62%), high (19.93%), and very high (12.06%). The area obtains high rainfall but has a very low amount of recharge due to a lack of proper soil and water conservation structures. The major outcome of the study showed that moderate and low potential is dominant. Geodetoctor results revealed that the magnitude influences on groundwater potential have been ranked as transmissivity (0.48), recharge (0.26), lineament density (0.26), lithology (0.13), drainage density (0.12), and geomorphology (0.06). The model results showed that using a convolutional neural network (CNN), groundwater potentiality can be delineated with higher predictive capability and accuracy. CNN-based AUC validation platform showed that 81.58% and 86.84% were accrued from the accuracy of training and testing values, respectively. Based on the findings, the local government can receive technical assistance for groundwater exploration and sustainable water resource development in the Gunabay watershed. Finally, the use of a detector-based deep learning algorithm can provide a new platform for industrial sectors, groundwater experts, scholars, and decision-makers.Keywords: CNN, geodetector, groundwater influencing factors, Groundwater potential, Gunabay watershed
Procedia PDF Downloads 21198 Effects of Cold Treatments on Methylation Profiles and Reproduction Mode of Diploid and Tetraploid Plants of Ranunculus kuepferi (Ranunculaceae)
Authors: E. Syngelaki, C. C. F. Schinkel, S. Klatt, E. Hörandl
Abstract:
Environmental influence can alter the conditions for plant development and can trigger changes in epigenetic variation. Thus, the exposure to abiotic environmental stress can lead to different DNA methylation profiles and may have evolutionary consequences for adaptation. Epigenetic control mechanisms may further influence mode of reproduction. The alpine species R. kuepferi has diploid and tetraploid cytotypes, that are mostly sexual and facultative apomicts, respectively. Hence, it is a suitable model system for studying the correlations of mode of reproduction, ploidy, and environmental stress. Diploid and tetraploid individuals were placed in two climate chambers and treated with low (+7°C day/+2°C night, -1°C cold shocks for three nights per week) and warm (control) temperatures (+15°C day/+10°C night). Subsequently, methylation sensitive-Amplified Fragment-Length Polymorphism (AFPL) markers were used to screen genome-wide methylation alterations triggered by stress treatments. The dataset was analyzed for four groups regarding treatment (cold/warm) and ploidy level (diploid/tetraploid), and also separately for full methylated, hemi-methylated and unmethylated sites. Patterns of epigenetic variation suggested that diploids differed significantly in their profiles from tetraploids independent from treatment, while treatments did not differ significantly within cytotypes. Furthermore, diploids are more differentiated than the tetraploids in overall methylation profiles of both treatments. This observation is in accordance with the increased frequency of apomictic seed formation in diploids and maintenance of facultative apomixis in tetraploids during the experiment. Global analysis of molecular variance showed higher epigenetic variation within groups than among them, while locus-by-locus analysis of molecular variance showed a high number (54.7%) of significantly differentiated un-methylated loci. To summarise, epigenetic variation seems to depend on ploidy level, and in diploids may be correlated to changes in mode of reproduction. However, further studies are needed to elucidate the mechanism and possible functional significance of these correlations.Keywords: apomixis, cold stress, DNA methylation, Ranunculus kuepferi
Procedia PDF Downloads 160197 Three Issues for Integrating Artificial Intelligence into Legal Reasoning
Authors: Fausto Morais
Abstract:
Artificial intelligence has been widely used in law. Programs are able to classify suits, to identify decision-making patterns, to predict outcomes, and to formalize legal arguments as well. In Brazil, the artificial intelligence victor has been classifying cases to supreme court’s standards. When those programs act doing those tasks, they simulate some kind of legal decision and legal arguments, raising doubts about how artificial intelligence can be integrated into legal reasoning. Taking this into account, the following three issues are identified; the problem of hypernormatization, the argument of legal anthropocentrism, and the artificial legal principles. Hypernormatization can be seen in the Brazilian legal context in the Supreme Court’s usage of the Victor program. This program generated efficiency and consistency. On the other hand, there is a feasible risk of over standardizing factual and normative legal features. Then legal clerks and programmers should work together to develop an adequate way to model legal language into computational code. If this is possible, intelligent programs may enact legal decisions in easy cases automatically cases, and, in this picture, the legal anthropocentrism argument takes place. Such an argument argues that just humans beings should enact legal decisions. This is so because human beings have a conscience, free will, and self unity. In spite of that, it is possible to argue against the anthropocentrism argument and to show how intelligent programs may work overcoming human beings' problems like misleading cognition, emotions, and lack of memory. In this way, intelligent machines could be able to pass legal decisions automatically by classification, as Victor in Brazil does, because they are binding by legal patterns and should not deviate from them. Notwithstanding, artificial intelligent programs can be helpful beyond easy cases. In hard cases, they are able to identify legal standards and legal arguments by using machine learning. For that, a dataset of legal decisions regarding a particular matter must be available, which is a reality in Brazilian Judiciary. Doing such procedure, artificial intelligent programs can support a human decision in hard cases, providing legal standards and arguments based on empirical evidence. Those legal features claim an argumentative weight in legal reasoning and should serve as references for judges when they must decide to maintain or overcome a legal standard.Keywords: artificial intelligence, artificial legal principles, hypernormatization, legal anthropocentrism argument, legal reasoning
Procedia PDF Downloads 145196 Factors Impacting Geostatistical Modeling Accuracy and Modeling Strategy of Fluvial Facies Models
Authors: Benbiao Song, Yan Gao, Zhuo Liu
Abstract:
Geostatistical modeling is the key technic for reservoir characterization, the quality of geological models will influence the prediction of reservoir performance greatly, but few studies have been done to quantify the factors impacting geostatistical reservoir modeling accuracy. In this study, 16 fluvial prototype models have been established to represent different geological complexity, 6 cases range from 16 to 361 wells were defined to reproduce all those 16 prototype models by different methodologies including SIS, object-based and MPFS algorithms accompany with different constraint parameters. Modeling accuracy ratio was defined to quantify the influence of each factor, and ten realizations were averaged to represent each accuracy ratio under the same modeling condition and parameters association. Totally 5760 simulations were done to quantify the relative contribution of each factor to the simulation accuracy, and the results can be used as strategy guide for facies modeling in the similar condition. It is founded that data density, geological trend and geological complexity have great impact on modeling accuracy. Modeling accuracy may up to 90% when channel sand width reaches up to 1.5 times of well space under whatever condition by SIS and MPFS methods. When well density is low, the contribution of geological trend may increase the modeling accuracy from 40% to 70%, while the use of proper variogram may have very limited contribution for SIS method. It can be implied that when well data are dense enough to cover simple geobodies, few efforts were needed to construct an acceptable model, when geobodies are complex with insufficient data group, it is better to construct a set of robust geological trend than rely on a reliable variogram function. For object-based method, the modeling accuracy does not increase obviously as SIS method by the increase of data density, but kept rational appearance when data density is low. MPFS methods have the similar trend with SIS method, but the use of proper geological trend accompany with rational variogram may have better modeling accuracy than MPFS method. It implies that the geological modeling strategy for a real reservoir case needs to be optimized by evaluation of dataset, geological complexity, geological constraint information and the modeling objective.Keywords: fluvial facies, geostatistics, geological trend, modeling strategy, modeling accuracy, variogram
Procedia PDF Downloads 264195 A Risk Assessment Tool for the Contamination of Aflatoxins on Dried Figs Based on Machine Learning Algorithms
Authors: Kottaridi Klimentia, Demopoulos Vasilis, Sidiropoulos Anastasios, Ihara Diego, Nikolaidis Vasileios, Antonopoulos Dimitrios
Abstract:
Aflatoxins are highly poisonous and carcinogenic compounds produced by species of the genus Aspergillus spp. that can infect a variety of agricultural foods, including dried figs. Biological and environmental factors, such as population, pathogenicity, and aflatoxinogenic capacity of the strains, topography, soil, and climate parameters of the fig orchards, are believed to have a strong effect on aflatoxin levels. Existing methods for aflatoxin detection and measurement, such as high performance liquid chromatography (HPLC), and enzyme-linked immunosorbent assay (ELISA), can provide accurate results, but the procedures are usually time-consuming, sample-destructive, and expensive. Predicting aflatoxin levels prior to crop harvest is useful for minimizing the health and financial impact of a contaminated crop. Consequently, there is interest in developing a tool that predicts aflatoxin levels based on topography and soil analysis data of fig orchards. This paper describes the development of a risk assessment tool for the contamination of aflatoxin on dried figs, based on the location and altitude of the fig orchards, the population of the fungus Aspergillus spp. in the soil, and soil parameters such as pH, saturation percentage (SP), electrical conductivity (EC), organic matter, particle size analysis (sand, silt, clay), the concentration of the exchangeable cations (Ca, Mg, K, Na), extractable P, and trace of elements (B, Fe, Mn, Zn and Cu), by employing machine learning methods. In particular, our proposed method integrates three machine learning techniques, i.e., dimensionality reduction on the original dataset (principal component analysis), metric learning (Mahalanobis metric for clustering), and k-nearest neighbors learning algorithm (KNN), into an enhanced model, with mean performance equal to 85% by terms of the Pearson correlation coefficient (PCC) between observed and predicted values.Keywords: aflatoxins, Aspergillus spp., dried figs, k-nearest neighbors, machine learning, prediction
Procedia PDF Downloads 184194 Improved Computational Efficiency of Machine Learning Algorithm Based on Evaluation Metrics to Control the Spread of Coronavirus in the UK
Authors: Swathi Ganesan, Nalinda Somasiri, Rebecca Jeyavadhanam, Gayathri Karthick
Abstract:
The COVID-19 crisis presents a substantial and critical hazard to worldwide health. Since the occurrence of the disease in late January 2020 in the UK, the number of infected people confirmed to acquire the illness has increased tremendously across the country, and the number of individuals affected is undoubtedly considerably high. The purpose of this research is to figure out a predictive machine learning archetypal that could forecast COVID-19 cases within the UK. This study concentrates on the statistical data collected from 31st January 2020 to 31st March 2021 in the United Kingdom. Information on total COVID cases registered, new cases encountered on a daily basis, total death registered, and patients’ death per day due to Coronavirus is collected from World Health Organisation (WHO). Data preprocessing is carried out to identify any missing values, outliers, or anomalies in the dataset. The data is split into 8:2 ratio for training and testing purposes to forecast future new COVID cases. Support Vector Machines (SVM), Random Forests, and linear regression algorithms are chosen to study the model performance in the prediction of new COVID-19 cases. From the evaluation metrics such as r-squared value and mean squared error, the statistical performance of the model in predicting the new COVID cases is evaluated. Random Forest outperformed the other two Machine Learning algorithms with a training accuracy of 99.47% and testing accuracy of 98.26% when n=30. The mean square error obtained for Random Forest is 4.05e11, which is lesser compared to the other predictive models used for this study. From the experimental analysis Random Forest algorithm can perform more effectively and efficiently in predicting the new COVID cases, which could help the health sector to take relevant control measures for the spread of the virus.Keywords: COVID-19, machine learning, supervised learning, unsupervised learning, linear regression, support vector machine, random forest
Procedia PDF Downloads 121193 Understanding Help Seeking among Black Women with Clinically Significant Posttraumatic Stress Symptoms
Authors: Glenda Wrenn, Juliet Muzere, Meldra Hall, Allyson Belton, Kisha Holden, Chanita Hughes-Halbert, Martha Kent, Bekh Bradley
Abstract:
Understanding the help seeking decision making process and experiences of health disparity populations with posttraumatic stress disorder (PTSD) is central to development of trauma-informed, culturally centered, and patient focused services. Yet, little is known about the decision making process among adult Black women who are non-treatment seekers as they are, by definition, not engaged in services. Methods: Audiotaped interviews were conducted with 30 African American adult women with clinically significant PTSD symptoms who were engaged in primary care, but not in treatment for PTSD despite symptom burden. A qualitative interview guide was used to elucidate key themes. Independent coding of themes mapped to theory and identification of emergent themes were conducted using qualitative methods. An existing quantitative dataset was analyzed to contextualize responses and provide a descriptive summary of the sample. Results: Emergent themes revealed that active mental avoidance, the intermittent nature of distress, ambivalence, and self-identified resilience as undermining to help seeking decisions. Participants were stuck within the help-seeking phase of ‘recognition’ of illness and retained a sense of “it is my decision” despite endorsing significant social and environmental negative influencers. Participants distinguished ‘help acceptance’ from ‘help seeking’ with greater willingness to accept help and importance placed on being of help to others. Conclusions: Elucidation of the decision-making process from the perspective of non-treatment seekers has implications for outreach and treatment within models of integrated and specialty systems care. The salience of responses to trauma symptoms and stagnation in the help seeking recognition phase are findings relevant to integrated care service design and community engagement.Keywords: culture, help-seeking, integrated care, PTSD
Procedia PDF Downloads 235192 Next Generation Radiation Risk Assessment and Prediction Tools Generation Applying AI-Machine (Deep) Learning Algorithms
Authors: Selim M. Khan
Abstract:
Indoor air quality is strongly influenced by the presence of radioactive radon (222Rn) gas. Indeed, exposure to high 222Rn concentrations is unequivocally linked to DNA damage and lung cancer and is a worsening issue in North American and European built environments, having increased over time within newer housing stocks as a function of as yet unclear variables. Indoor air radon concentration can be influenced by a wide range of environmental, structural, and behavioral factors. As some of these factors are quantitative while others are qualitative, no single statistical model can determine indoor radon level precisely while simultaneously considering all these variables across a complex and highly diverse dataset. The ability of AI- machine (deep) learning to simultaneously analyze multiple quantitative and qualitative features makes it suitable to predict radon with a high degree of precision. Using Canadian and Swedish long-term indoor air radon exposure data, we are using artificial deep neural network models with random weights and polynomial statistical models in MATLAB to assess and predict radon health risk to human as a function of geospatial, human behavioral, and built environmental metrics. Our initial artificial neural network with random weights model run by sigmoid activation tested different combinations of variables and showed the highest prediction accuracy (>96%) within the reasonable iterations. Here, we present details of these emerging methods and discuss strengths and weaknesses compared to the traditional artificial neural network and statistical methods commonly used to predict indoor air quality in different countries. We propose an artificial deep neural network with random weights as a highly effective method for assessing and predicting indoor radon.Keywords: radon, radiation protection, lung cancer, aI-machine deep learnng, risk assessment, risk prediction, Europe, North America
Procedia PDF Downloads 96191 Computational Pipeline for Lynch Syndrome Detection: Integrating Alignment, Variant Calling, and Annotations
Authors: Rofida Gamal, Mostafa Mohammed, Mariam Adel, Marwa Gamal, Marwa kamal, Ayat Saber, Maha Mamdouh, Amira Emad, Mai Ramadan
Abstract:
Lynch Syndrome is an inherited genetic condition associated with an increased risk of colorectal and other cancers. Detecting Lynch Syndrome in individuals is crucial for early intervention and preventive measures. This study proposes a computational pipeline for Lynch Syndrome detection by integrating alignment, variant calling, and annotation. The pipeline leverages popular tools such as FastQC, Trimmomatic, BWA, bcftools, and ANNOVAR to process the input FASTQ file, perform quality trimming, align reads to the reference genome, call variants, and annotate them. It is believed that the computational pipeline was applied to a dataset of Lynch Syndrome cases, and its performance was evaluated. It is believed that the quality check step ensured the integrity of the sequencing data, while the trimming process is thought to have removed low-quality bases and adaptors. In the alignment step, it is believed that the reads were accurately mapped to the reference genome, and the subsequent variant calling step is believed to have identified potential genetic variants. The annotation step is believed to have provided functional insights into the detected variants, including their effects on known Lynch Syndrome-associated genes. The results obtained from the pipeline revealed Lynch Syndrome-related positions in the genome, providing valuable information for further investigation and clinical decision-making. The pipeline's effectiveness was demonstrated through its ability to streamline the analysis workflow and identify potential genetic markers associated with Lynch Syndrome. It is believed that the computational pipeline presents a comprehensive and efficient approach to Lynch Syndrome detection, contributing to early diagnosis and intervention. The modularity and flexibility of the pipeline are believed to enable customization and adaptation to various datasets and research settings. Further optimization and validation are believed to be necessary to enhance performance and applicability across diverse populations.Keywords: Lynch Syndrome, computational pipeline, alignment, variant calling, annotation, genetic markers
Procedia PDF Downloads 76190 Speckle-Based Phase Contrast Micro-Computed Tomography with Neural Network Reconstruction
Authors: Y. Zheng, M. Busi, A. F. Pedersen, M. A. Beltran, C. Gundlach
Abstract:
X-ray phase contrast imaging has shown to yield a better contrast compared to conventional attenuation X-ray imaging, especially for soft tissues in the medical imaging energy range. This can potentially lead to better diagnosis for patients. However, phase contrast imaging has mainly been performed using highly brilliant Synchrotron radiation, as it requires high coherence X-rays. Many research teams have demonstrated that it is also feasible using a laboratory source, bringing it one step closer to clinical use. Nevertheless, the requirement of fine gratings and high precision stepping motors when using a laboratory source prevents it from being widely used. Recently, a random phase object has been proposed as an analyzer. This method requires a much less robust experimental setup. However, previous studies were done using a particular X-ray source (liquid-metal jet micro-focus source) or high precision motors for stepping. We have been working on a much simpler setup with just small modification of a commercial bench-top micro-CT (computed tomography) scanner, by introducing a piece of sandpaper as the phase analyzer in front of the X-ray source. However, it needs a suitable algorithm for speckle tracking and 3D reconstructions. The precision and sensitivity of speckle tracking algorithm determine the resolution of the system, while the 3D reconstruction algorithm will affect the minimum number of projections required, thus limiting the temporal resolution. As phase contrast imaging methods usually require much longer exposure time than traditional absorption based X-ray imaging technologies, a dynamic phase contrast micro-CT with a high temporal resolution is particularly challenging. Different reconstruction methods, including neural network based techniques, will be evaluated in this project to increase the temporal resolution of the phase contrast micro-CT. A Monte Carlo ray tracing simulation (McXtrace) was used to generate a large dataset to train the neural network, in order to address the issue that neural networks require large amount of training data to get high-quality reconstructions.Keywords: micro-ct, neural networks, reconstruction, speckle-based x-ray phase contrast
Procedia PDF Downloads 257189 3-Dimensional Contamination Conceptual Site Model: A Case Study Illustrating the Multiple Applications of Developing and Maintaining a 3D Contamination Model during an Active Remediation Project on a Former Urban Gasworks Site
Authors: Duncan Fraser
Abstract:
A 3-Dimensional (3D) conceptual site model was developed using the Leapfrog Works® platform utilising a comprehensive historical dataset for a large former Gasworks site in Fitzroy, Melbourne. The gasworks had been constructed across two fractured geological units with varying hydraulic conductivities. A Newer Volcanic (basaltic) outcrop covered approximately half of the site and was overlying a fractured Melbourne formation (Siltstone) bedrock outcropping over the remaining portion. During the investigative phase of works, a dense non-aqueous phase liquid (DNAPL) plume (coal tar) was identified within both geological units in the subsurface originating from multiple sources, including gasholders, tar wells, condensers, and leaking pipework. The first stage of model development was undertaken to determine the horizontal and vertical extents of the coal tar in the subsurface and assess the potential causality between potential sources, plume location, and site geology. Concentrations of key contaminants of interest (COIs) were also interpolated within Leapfrog to refine the distribution of contaminated soils. The model was subsequently used to develop a robust soil remediation strategy and achieve endorsement from an Environmental Auditor. A change in project scope, following the removal and validation of the three former gasholders, necessitated the additional excavation of a significant volume of residual contaminated rock to allow for the future construction of two-story underground basements. To assess financial liabilities associated with the offsite disposal or thermal treatment of material, the 3D model was updated with three years of additional analytical data from the active remediation phase of works. Chemical concentrations and the residual tar plume within the rock fractures were modelled to pre-classify the in-situ material and enhance separation strategies to prevent the unnecessary treatment of material and reduce costs.Keywords: 3D model, contaminated land, Leapfrog, remediation
Procedia PDF Downloads 133188 Comparing Quality of Care in Family Planning Services in Primary Public and Private Health Care Facilities in Ethiopia
Authors: Gizachew Assefa Tessema, Mohammad Afzal Mahmood, Judith Streak Gomersall, Caroline O. Laurence
Abstract:
Introduction: Improving access to quality family planning services is the key to improving health of women and children. However, there is currently little evidence on the quality and scope of family planning services provided by private facilities, and this compares to the services provided in public facilities in Ethiopia. This is important, particularly in determining whether the government should further expand the roles of the private sector in the delivery of family planning facility. Methods: This study used the 2014 Ethiopian Services Provision Assessment Plus (ESPA+) survey dataset for comparing the structural aspects of quality of care in family planning services. The present analysis used a weighted sample of 1093 primary health care facilities (955 public and 138 private). This study employed logistic regression analysis to compare key structural variables between public and private facilities. While taking the structural variables as an outcome for comparison, the facility type (public vs private) were used as the key exposure of interest. Results: When comparing availability of basic amenities (infrastructure), public facilities were less likely to have functional cell phones (AOR=0.12; 95% CI: 0.07-0.21), and water supply (AOR=0.29; 95% CI: 0.15-0.58) than private facilities. However, public facilities were more likely to have staff available 24 hours in the facility (AOR=0.12; 95% CI: 0.07-0.21), providers having family planning related training in the past 24 months (AOR=4.4; 95% CI: 2.51, 7.64) and possessing guidelines/protocols (AOR= 3.1 95% CI: 1.87, 5.24) than private facilities. Moreover, comparing the availability of equipment, public facilities had higher odds of having pelvic model for IUD demonstration (AOR=2.60; 95% CI: 1.35, 5.01) and penile model for condom demonstration (AOR=2.51; 95% CI: 1.32, 4.78) than private facilities. Conclusion: The present study suggests that Ethiopian government needs to provide emphasis towards the private sector in terms of providing family planning guidelines and training on family planning services for their staff. It is also worthwhile for the public health facilities to allocate funding for improving the availability of basic amenities. Implications for policy and/ or practice: This study calls policy makers to design appropriate strategies in providing opportunities for training a health care providers working in private health facility.Keywords: quality of care, family planning, public-private, Ethiopia
Procedia PDF Downloads 353187 Parametric Approach for Reserve Liability Estimate in Mortgage Insurance
Authors: Rajinder Singh, Ram Valluru
Abstract:
Chain Ladder (CL) method, Expected Loss Ratio (ELR) method and Bornhuetter-Ferguson (BF) method, in addition to more complex transition-rate modeling, are commonly used actuarial reserving methods in general insurance. There is limited published research about their relative performance in the context of Mortgage Insurance (MI). In our experience, these traditional techniques pose unique challenges and do not provide stable claim estimates for medium to longer term liabilities. The relative strengths and weaknesses among various alternative approaches revolve around: stability in the recent loss development pattern, sufficiency and reliability of loss development data, and agreement/disagreement between reported losses to date and ultimate loss estimate. CL method results in volatile reserve estimates, especially for accident periods with little development experience. The ELR method breaks down especially when ultimate loss ratios are not stable and predictable. While the BF method provides a good tradeoff between the loss development approach (CL) and ELR, the approach generates claim development and ultimate reserves that are disconnected from the ever-to-date (ETD) development experience for some accident years that have more development experience. Further, BF is based on subjective a priori assumption. The fundamental shortcoming of these methods is their inability to model exogenous factors, like the economy, which impact various cohorts at the same chronological time but at staggered points along their life-time development. This paper proposes an alternative approach of parametrizing the loss development curve and using logistic regression to generate the ultimate loss estimate for each homogeneous group (accident year or delinquency period). The methodology was tested on an actual MI claim development dataset where various cohorts followed a sigmoidal trend, but levels varied substantially depending upon the economic and operational conditions during the development period spanning over many years. The proposed approach provides the ability to indirectly incorporate such exogenous factors and produce more stable loss forecasts for reserving purposes as compared to the traditional CL and BF methods.Keywords: actuarial loss reserving techniques, logistic regression, parametric function, volatility
Procedia PDF Downloads 131186 The Role and Effects of Communication on Occupational Safety: A Review
Authors: Pieter A. Cornelissen, Joris J. Van Hoof
Abstract:
The interest in improving occupational safety started almost simultaneously with the beginning of the Industrial Revolution. Yet, it was not until the late 1970’s before the role of communication was considered in scientific research regarding occupational safety. In recent years the importance of communication as a means to improve occupational safety has increased. Not only as communication might have a direct effect on safety performance and safety outcomes, but also as it can be viewed as a major component of other important safety-related elements (e.g., training, safety meetings, leadership). And while safety communication is an increasingly important topic in research, its operationalization is often vague and differs among studies. This is not only problematic when comparing results, but also in applying these results to practice and the work floor. By means of an in-depth analysis—building on an existing dataset—this review aims to overcome these problems. The initial database search yielded 25.527 articles, which was reduced to a research corpus of 176 articles. Focusing on the 37 articles of this corpus that addressed communication (related to safety outcomes and safety performance), the current study will provide a comprehensive overview of the role and effects of safety communication and outlines the conditions under which communication contributes to a safer work environment. The study shows that in literature a distinction is commonly made between safety communication (i.e., the exchange or dissemination of safety-related information) and feedback (i.e. a reactive form of communication). And although there is a consensus among researchers that both communication and feedback positively affect safety performance, there is a debate about the directness of this relationship. Whereas some researchers assume a direct relationship between safety communication and safety performance, others state that this relationship is mediated by safety climate. One of the key findings is that despite the strongly present view that safety communication is a formal and top-down safety management tool, researchers stress the importance of open communication that encourages and allows employees to express their worries, experiences, views, and share information. This raises questions with regard to other directions (e.g., bottom-up, horizontal) and forms of communication (e.g., informal). The current review proposes a framework to overcome the often vague and different operationalizations of safety communication. The proposed framework can be used to characterize safety communication in terms of stakeholders, direction, and characteristics of communication (e.g., medium usage).Keywords: communication, feedback, occupational safety, review
Procedia PDF Downloads 302185 Geospatial Multi-Criteria Evaluation to Predict Landslide Hazard Potential in the Catchment of Lake Naivasha, Kenya
Authors: Abdel Rahman Khider Hassan
Abstract:
This paper describes a multi-criteria geospatial model for prediction of landslide hazard zonation (LHZ) for Lake Naivasha catchment (Kenya), based on spatial analysis of integrated datasets of location intrinsic parameters (slope stability factors) and external landslides triggering factors (natural and man-made factors). The intrinsic dataset included: lithology, geometry of slope (slope inclination, aspect, elevation, and curvature) and land use/land cover. The landslides triggering factors included: rainfall as the climatic factor, in addition to the destructive effects reflected by proximity of roads and drainage network to areas that are susceptible to landslides. No published study on landslides has been obtained for this area. Thus, digital datasets of the above spatial parameters were conveniently acquired, stored, manipulated and analyzed in a Geographical Information System (GIS) using a multi-criteria grid overlay technique (in ArcGIS 10.2.2 environment). Deduction of landslide hazard zonation is done by applying weights based on relative contribution of each parameter to the slope instability, and finally, the weighted parameters grids were overlaid together to generate a map of the potential landslide hazard zonation (LHZ) for the lake catchment. From the total surface of 3200 km² of the lake catchment, most of the region (78.7 %; 2518.4 km²) is susceptible to moderate landslide hazards, whilst about 13% (416 km²) is occurring under high hazards. Only 1.0% (32 km²) of the catchment is displaying very high landslide hazards, and the remaining area (7.3 %; 233.6 km²) displays low probability of landslide hazards. This result confirms the importance of steep slope angles, lithology, vegetation land cover and slope orientation (aspect) as the major determining factors of slope failures. The information provided by the produced map of landslide hazard zonation (LHZ) could lay the basis for decision making as well as mitigation and applications in avoiding potential losses caused by landslides in the Lake Naivasha catchment in the Kenya Highlands.Keywords: decision making, geospatial, landslide, multi-criteria, Naivasha
Procedia PDF Downloads 206184 Private and Public Health Sector Difference on Client Satisfaction: Results from Secondary Data Analysis in Sindh, Pakistan
Authors: Wajiha Javed, Arsalan Jabbar, Nelofer Mehboob, Muhammad Tafseer, Zahid Memon
Abstract:
Introduction: Researchers globally have strived to explore diverse factors that augment the continuation and uptake of family planning methods. Clients’ satisfaction is one of the core determinants facilitating continuation of family planning methods. There is a major debate yet scanty evidence to contrast public and private sectors with respect to client satisfaction. The objective of this study is to compare quality-of-care provided by public and private sectors of Pakistan through a client satisfaction lens. Methods: We used Pakistan Demographic Heath Survey 2012-13 dataset (Sindh province) on a total of 3133 Married Women of Reproductive Age (MWRA) aged 15-49 years. Source of family planning (public/private sector) was the main exposure variable. Outcome variable was client satisfaction judged by ten different dimensions of client satisfaction. Means and standard deviations were calculated for continuous variable while for categorical variable frequencies and percentages were computed. For univariate analysis, Chi-square/Fisher Exact test was used to find an association between clients’ satisfaction in public and private sectors. Ten different multivariate models were made. Variables were checked for multi-collinearity, confounding, and interaction, and then advanced logistic regression was used to explore the relationship between client satisfaction and dependent outcome after adjusting for all known confounding factors and results are presented as OR and AOR (95% CI). Results: Multivariate analyses showed that clients were less satisfied in contraceptive provision from private sector as compared to public sector (AOR 0.92,95% CI 0.63-1.68) even though the result was not statistically significant. Clients were more satisfied from private sector as compared to the public sector with respect to other determinants of quality-of-care (follow-up care (AOR 3.29, 95% CI 1.95-5.55), infection prevention (AOR 2.41, 95% CI 1.60-3.62), counseling services (AOR 2.01, 95% CI 1.27-3.18, timely treatment (AOR 3.37, 95% CI 2.20-5.15), attitude of staff (AOR 2.23, 95% CI 1.50-3.33), punctuality of staff (AOR 2.28, 95% CI 1.92-4.13), timely referring (AOR 2.34, 95% CI 1.63-3.35), staff cooperation (AOR 1.75, 95% CI 1.22-2.51) and complications handling (AOR 2.27, 95% CI 1.56-3.29).Keywords: client satisfaction, family planning, public private partnership, quality of care
Procedia PDF Downloads 419183 Management as a Proxy for Firm Quality
Authors: Petar Dobrev
Abstract:
There is no agreed-upon definition of firm quality. While profitability and stock performance often qualify as popular proxies of quality, in this project, we aim to identify quality without relying on a firm’s financial statements or stock returns as selection criteria. Instead, we use firm-level data on management practices across small to medium-sized U.S. manufacturing firms from the World Management Survey (WMS) to measure firm quality. Each firm in the WMS dataset is assigned a mean management score from 0 to 5, with higher scores identifying better-managed firms. This management score serves as our proxy for firm quality and is the sole criteria we use to separate firms into portfolios comprised of high-quality and low-quality firms. We define high-quality (low-quality) firms as those firms with a management score of one standard deviation above (below) the mean. To study whether this proxy for firm quality can identify better-performing firms, we link this data to Compustat and The Center for Research in Security Prices (CRSP) to obtain firm-level data on financial performance and monthly stock returns, respectively. We find that from 1999 to 2019 (our sample data period), firms in the high-quality portfolio are consistently more profitable — higher operating profitability and return on equity compared to low-quality firms. In addition, high-quality firms also exhibit a lower risk of bankruptcy — a higher Altman Z-score. Next, we test whether the stocks of the firms in the high-quality portfolio earn superior risk-adjusted excess returns. We regress the monthly excess returns on each portfolio on the Fama-French 3-factor, 4-factor, and 5-factor models, the betting-against-beta factor, and the quality-minus-junk factor. We find no statistically significant differences in excess returns between both portfolios, suggesting that stocks of high-quality (well managed) firms do not earn superior risk-adjusted returns compared to low-quality (poorly managed) firms. In short, our proxy for firm quality, the WMS management score, can identify firms with superior financial performance (higher profitability and reduced risk of bankruptcy). However, our management proxy cannot identify stocks that earn superior risk-adjusted returns, suggesting no statistically significant relationship between managerial quality and stock performance.Keywords: excess stock returns, management, profitability, quality
Procedia PDF Downloads 93182 Methodology for the Multi-Objective Analysis of Data Sets in Freight Delivery
Authors: Dale Dzemydiene, Aurelija Burinskiene, Arunas Miliauskas, Kristina Ciziuniene
Abstract:
Data flow and the purpose of reporting the data are different and dependent on business needs. Different parameters are reported and transferred regularly during freight delivery. This business practices form the dataset constructed for each time point and contain all required information for freight moving decisions. As a significant amount of these data is used for various purposes, an integrating methodological approach must be developed to respond to the indicated problem. The proposed methodology contains several steps: (1) collecting context data sets and data validation; (2) multi-objective analysis for optimizing freight transfer services. For data validation, the study involves Grubbs outliers analysis, particularly for data cleaning and the identification of statistical significance of data reporting event cases. The Grubbs test is often used as it measures one external value at a time exceeding the boundaries of standard normal distribution. In the study area, the test was not widely applied by authors, except when the Grubbs test for outlier detection was used to identify outsiders in fuel consumption data. In the study, the authors applied the method with a confidence level of 99%. For the multi-objective analysis, the authors would like to select the forms of construction of the genetic algorithms, which have more possibilities to extract the best solution. For freight delivery management, the schemas of genetic algorithms' structure are used as a more effective technique. Due to that, the adaptable genetic algorithm is applied for the description of choosing process of the effective transportation corridor. In this study, the multi-objective genetic algorithm methods are used to optimize the data evaluation and select the appropriate transport corridor. The authors suggest a methodology for the multi-objective analysis, which evaluates collected context data sets and uses this evaluation to determine a delivery corridor for freight transfer service in the multi-modal transportation network. In the multi-objective analysis, authors include safety components, the number of accidents a year, and freight delivery time in the multi-modal transportation network. The proposed methodology has practical value in the management of multi-modal transportation processes.Keywords: multi-objective, analysis, data flow, freight delivery, methodology
Procedia PDF Downloads 180181 Housing Price Dynamics: Comparative Study of 1980-1999 and the New Millenium
Authors: Janne Engblom, Elias Oikarinen
Abstract:
The understanding of housing price dynamics is of importance to a great number of agents: to portfolio investors, banks, real estate brokers and construction companies as well as to policy makers and households. A panel dataset is one that follows a given sample of individuals over time, and thus provides multiple observations on each individual in the sample. Panel data models include a variety of fixed and random effects models which form a wide range of linear models. A special case of panel data models is dynamic in nature. A complication regarding a dynamic panel data model that includes the lagged dependent variable is endogeneity bias of estimates. Several approaches have been developed to account for this problem. In this paper, the panel models were estimated using the Common Correlated Effects estimator (CCE) of dynamic panel data which also accounts for cross-sectional dependence which is caused by common structures of the economy. In presence of cross-sectional dependence standard OLS gives biased estimates. In this study, U.S housing price dynamics were examined empirically using the dynamic CCE estimator with first-difference of housing price as the dependent and first-differences of per capita income, interest rate, housing stock and lagged price together with deviation of housing prices from their long-run equilibrium level as independents. These deviations were also estimated from the data. The aim of the analysis was to provide estimates with comparisons of estimates between 1980-1999 and 2000-2012. Based on data of 50 U.S cities over 1980-2012 differences of short-run housing price dynamics estimates were mostly significant when two time periods were compared. Significance tests of differences were provided by the model containing interaction terms of independents and time dummy variable. Residual analysis showed very low cross-sectional correlation of the model residuals compared with the standard OLS approach. This means a good fit of CCE estimator model. Estimates of the dynamic panel data model were in line with the theory of housing price dynamics. Results also suggest that dynamics of a housing market is evolving over time.Keywords: dynamic model, panel data, cross-sectional dependence, interaction model
Procedia PDF Downloads 251180 Developing A Third Degree Of Freedom For Opinion Dynamics Models Using Scales
Authors: Dino Carpentras, Alejandro Dinkelberg, Michael Quayle
Abstract:
Opinion dynamics models use an agent-based modeling approach to model people’s opinions. Model's properties are usually explored by testing the two 'degrees of freedom': the interaction rule and the network topology. The latter defines the connection, and thus the possible interaction, among agents. The interaction rule, instead, determines how agents select each other and update their own opinion. Here we show the existence of the third degree of freedom. This can be used for turning one model into each other or to change the model’s output up to 100% of its initial value. Opinion dynamics models represent the evolution of real-world opinions parsimoniously. Thus, it is fundamental to know how real-world opinion (e.g., supporting a candidate) could be turned into a number. Specifically, we want to know if, by choosing a different opinion-to-number transformation, the model’s dynamics would be preserved. This transformation is typically not addressed in opinion dynamics literature. However, it has already been studied in psychometrics, a branch of psychology. In this field, real-world opinions are converted into numbers using abstract objects called 'scales.' These scales can be converted one into the other, in the same way as we convert meters to feet. Thus, in our work, we analyze how this scale transformation may affect opinion dynamics models. We perform our analysis both using mathematical modeling and validating it via agent-based simulations. To distinguish between scale transformation and measurement error, we first analyze the case of perfect scales (i.e., no error or noise). Here we show that a scale transformation may change the model’s dynamics up to a qualitative level. Meaning that a researcher may reach a totally different conclusion, even using the same dataset just by slightly changing the way data are pre-processed. Indeed, we quantify that this effect may alter the model’s output by 100%. By using two models from the standard literature, we show that a scale transformation can transform one model into the other. This transformation is exact, and it holds for every result. Lastly, we also test the case of using real-world data (i.e., finite precision). We perform this test using a 7-points Likert scale, showing how even a small scale change may result in different predictions or a number of opinion clusters. Because of this, we think that scale transformation should be considered as a third-degree of freedom for opinion dynamics. Indeed, its properties have a strong impact both on theoretical models and for their application to real-world data.Keywords: degrees of freedom, empirical validation, opinion scale, opinion dynamics
Procedia PDF Downloads 155179 Characterizing Nasal Microbiota in COVID-19 Patients: Insights from Nanopore Technology and Comparative Analysis
Authors: David Pinzauti, Simon De Jaegher, Maria D'Aguano, Manuele Biazzo
Abstract:
The COVID-19 pandemic has left an indelible mark on global health, leading to a pressing need for understanding the intricate interactions between the virus and the human microbiome. This study focuses on characterizing the nasal microbiota of patients affected by COVID-19, with a specific emphasis on the comparison with unaffected individuals, to shed light on the crucial role of the microbiome in the development of this viral disease. To achieve this objective, Nanopore technology was employed to analyze the bacterial 16s rRNA full-length gene present in nasal swabs collected in Malta between January 2021 and August 2022. A comprehensive dataset consisting of 268 samples (126 SARS-negative samples and 142 SARS-positive samples) was subjected to a comparative analysis using an in-house, custom pipeline. The findings from this study revealed that individuals affected by COVID-19 possess a nasal microbiota that is significantly less diverse, as evidenced by lower α diversity, and is characterized by distinct microbial communities compared to unaffected individuals. The beta diversity analyses were carried out at different taxonomic resolutions. At the phylum level, Bacteroidota was found to be more prevalent in SARS-negative samples, suggesting a potential decrease during the course of viral infection. At the species level, the identification of several specific biomarkers further underscores the critical role of the nasal microbiota in COVID-19 pathogenesis. Notably, species such as Finegoldia magna, Moraxella catarrhalis, and others exhibited relative abundance in SARS-positive samples, potentially serving as significant indicators of the disease. This study presents valuable insights into the relationship between COVID-19 and the nasal microbiota. The identification of distinct microbial communities and potential biomarkers associated with the disease offers promising avenues for further research and therapeutic interventions aimed at enhancing public health outcomes in the context of COVID-19.Keywords: COVID-19, nasal microbiota, nanopore technology, 16s rRNA gene, biomarkers
Procedia PDF Downloads 68178 High-Throughput Artificial Guide RNA Sequence Design for Type I, II and III CRISPR/Cas-Mediated Genome Editing
Authors: Farahnaz Sadat Golestan Hashemi, Mohd Razi Ismail, Mohd Y. Rafii
Abstract:
A huge revolution has emerged in genome engineering by the discovery of CRISPR (clustered regularly interspaced palindromic repeats) and CRISPR-associated system genes (Cas) in bacteria. The function of type II Streptococcus pyogenes (Sp) CRISPR/Cas9 system has been confirmed in various species. Other S. thermophilus (St) CRISPR-Cas systems, CRISPR1-Cas and CRISPR3-Cas, have been also reported for preventing phage infection. The CRISPR1-Cas system interferes by cleaving foreign dsDNA entering the cell in a length-specific and orientation-dependant manner. The S. thermophilus CRISPR3-Cas system also acts by cleaving phage dsDNA genomes at the same specific position inside the targeted protospacer as observed in the CRISPR1-Cas system. It is worth mentioning, for the effective DNA cleavage activity, RNA-guided Cas9 orthologs require their own specific PAM (protospacer adjacent motif) sequences. Activity levels are based on the sequence of the protospacer and specific combinations of favorable PAM bases. Therefore, based on the specific length and sequence of PAM followed by a constant length of target site for the three orthogonals of Cas9 protein, a well-organized procedure will be required for high-throughput and accurate mining of possible target sites in a large genomic dataset. Consequently, we created a reliable procedure to explore potential gRNA sequences for type I (Streptococcus thermophiles), II (Streptococcus pyogenes), and III (Streptococcus thermophiles) CRISPR/Cas systems. To mine CRISPR target sites, four different searching modes of sgRNA binding to target DNA strand were applied. These searching modes are as follows: i) coding strand searching, ii) anti-coding strand searching, iii) both strand searching, and iv) paired-gRNA searching. The output of such procedure highlights the power of comparative genome mining for different CRISPR/Cas systems. This could yield a repertoire of Cas9 variants with expanded capabilities of gRNA design, and will pave the way for further advance genome and epigenome engineering.Keywords: CRISPR/Cas systems, gRNA mining, Streptococcus pyogenes, Streptococcus thermophiles
Procedia PDF Downloads 257