Search results for: genetic algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3376

2206 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning applications in computer vision are advancing rapidly, making it possible to monitor public spaces and quickly identify potentially anomalous behaviour in crowd scenes. The purpose of the current work is therefore to improve the safety of people at crowded events during panic behaviour by introducing Aggregation of Ensembles (AOE), which uses pre-trained ConvNets and a pool of classifiers to find anomalies in video data of packed scenes. Since K-means, KNN, CNN, SVD, Faster-CNN, and YOLOv5 architectures learn different levels of semantic representation from crowd videos, the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNNs), allowing for the extraction of enriched feature sets. In addition to the above algorithms, the approach uses a long short-term memory neural network to forecast future feature values and a handcrafted feature that takes into account the peculiarities of the crowd to understand human behavior. Experiments are run on well-known datasets of panic situations to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 156
2205 Pharmacogenetics of P2Y12 Receptor Inhibitors

Authors: Ragy Raafat Gaber Attaalla

Abstract:

For cardiovascular illness, oral P2Y12 inhibitors including clopidogrel, prasugrel, and ticagrelor are frequently recommended. Each of these medications has advantages and disadvantages. In the absence of genotyping, it has been demonstrated that the stronger platelet aggregation inhibitors prasugrel and ticagrelor are superior to clopidogrel at preventing significant adverse cardiovascular events following an acute coronary syndrome and percutaneous coronary intervention (PCI). Both, however, carry a higher risk of bleeding unrelated to a coronary artery bypass. As a prodrug, clopidogrel needs to be bioactivated, principally by the CYP2C19 enzyme. A CYP2C19 no-function allele and diminished or absent CYP2C19 enzyme activity are present in about 30% of people. The reduced exposure to the active metabolite of clopidogrel and reduced inhibition of platelet aggregation among clopidogrel-treated carriers of a CYP2C19 no-function allele likely contributed to the reduced efficacy of clopidogrel in clinical trials. The pharmacogenetic evidence for clopidogrel is strongest in the setting of PCI, but evidence for other indications is growing. One of the most typical examples of clinical pharmacogenetic application is CYP2C19 genotype-guided antiplatelet medication following PCI. Guidance is available from expert consensus groups and regulatory bodies to assist with incorporating genetic information into P2Y12 inhibitor prescribing decisions. Here, we examine the data supporting the effects of genotype-guided P2Y12 inhibitor selection on clopidogrel response and outcomes and discuss tips for pharmacogenetic implementation. We also discuss procedures for using genotype data to choose P2Y12 inhibitor therapies as well as any unmet research needs. Finally, choosing a P2Y12 inhibitor medication that optimally balances the atherothrombotic and bleeding risks may be influenced by both clinical and genetic factors.

Keywords: inhibitors, cardiovascular events, coronary intervention, pharmacogenetic implementation

Procedia PDF Downloads 106
2204 Exploring the Role of Phosphorylation on the β-lactamase Activity of OXA24/40

Authors: Dharshika Rajalingam, Jeffery W. Peng

Abstract:

Acinetobacter baumannii, recognized as a multidrug-resistant pathogen, is a challenging threat to global health. β-lactamase production is one of the principal resistance mechanisms developed by A. baumannii to survive β-lactam antibiotics. OXA24/40 is a well-documented carbapenem-hydrolyzing class D β-lactamase (CHDL). OXA24/40 has been shown to confer resistivity against doripenem, one of the carbapenems, by two different mechanisms: hydrolysis and β-lactonization. Furthermore, it undergoes genetic mutations that broaden its β-lactamase activity so that the organism can survive in antibiotic environments. One of the crucial means by which prokaryotes adapt is post-translational modification (PTM), mainly phosphorylation. However, the PTM of OXA24/40 is an unknown feature, and the impact of PTM on antibiotic resistivity is yet to be explored. We approached these questions using NMR and MS techniques and found that OXA24/40 could be phosphorylated in vitro. Ser81, in the active STFK motif of the catalytic pocket of OXA24/40, was identified as the site of phosphorylation using a 1D 31P NMR experiment; S81 is required to form the acyl-enzyme complex between the enzyme and β-lactam antibiotics. The activity of completely phosphorylated wild-type OXA24/40 against doripenem revealed that phosphorylation of the active Ser inactivates the β-lactamase activity of OXA24/40. The 1D 1H CPMG NMR-based activity assay of phosphorylated OXA24/40 against doripenem confirmed that both deactivating mechanisms are inhibited by phosphorylation. The carbamylated lysine in the active STFK motif is one of the critical features of CHDLs, required for the acylation and deacylation reactions of the enzyme. The 1D 13C NMR experiment confirmed that K84 of phosphorylated OXA24/40 is de-carbamylated. Phosphorylation of OXA24/40 therefore affects both the active S81 and the carbamylated K84 that are required for the resistivity conferred by the β-lactamase. Phosphorylation could thus be one of the reasons for the genetic mutation of OXA24/40 in the development of antibiotic resistivity. Further research can lead to an understanding of the effect of phosphorylation on the clinical mutants of the OXA24-like β-lactamase family and on the broadening of β-lactamase activity.

Keywords: OXA24/40, phosphorylation, clinical mutants, resistivity

Procedia PDF Downloads 74
2203 Analysis of Differentially Expressed Genes in Spontaneously Occurring Canine Melanoma

Authors: Simona Perga, Chiara Beltramo, Floriana Fruscione, Isabella Martini, Federica Cavallo, Federica Riccardo, Paolo Buracco, Selina Iussich, Elisabetta Razzuoli, Katia Varello, Lorella Maniscalco, Elena Bozzetta, Angelo Ferrari, Paola Modesto

Abstract:

Introduction: Human and canine melanoma share clinical and histologic characteristics, making dogs a good model for comparative oncology. Identifying specific genes and better understanding the genetic landscape, signaling pathways, and tumor–microenvironment interactions involved in cancer onset and progression are essential for developing therapeutic strategies against this tumor in both species. In the present study, the differential expression of genes in spontaneously occurring canine melanoma and in paired normal tissue was investigated by targeted RNAseq. Material and Methods: Total RNA was extracted from 17 canine malignant melanoma (CMM) samples and from five paired normal tissues stored in RNA-later. To capture greater genetic variability, gene expression analysis was carried out using two panels (Qiagen), Human Immuno-Oncology (HIO) and Mouse Immuno-Oncology (MIO), on the MiSeq platform (Illumina). These kits allow the detection of the expression profile of 990 genes involved in the immune response against tumors in humans and mice. The data were analyzed with the CLCbio Genomics Workbench (Qiagen) software using the Canis lupus familiaris genome as a reference. Data analysis was carried out both by comparing the biologic groups (tumoral vs. healthy tissues) and by comparing neoplastic tissue vs. paired healthy tissue; a Fold Change greater than two and a p-value less than 0.05 were set as the thresholds for selecting genes of interest. Results and Discussion: Using the HIO, 63 down-regulated genes were detected; 13 of those were also down-regulated when comparing neoplastic samples vs. paired healthy tissue. Eighteen genes were up-regulated; 14 of those were also down-regulated when comparing neoplastic samples vs. paired healthy tissue. Using the MIO, 35 down-regulated genes were detected; only four of these were also down-regulated when comparing neoplastic samples vs. paired healthy tissue. Twelve genes were up-regulated in both types of analysis. Considering the two kits, the greatest variation in Fold Change was in up-regulated genes. Dogs displayed a greater genetic homology with humans than with mice; moreover, the results show that the two kits detect different genes. Most of these genes have specific cellular functions or belong to particular enzymatic categories; some have already been described as correlated with human melanoma, which confirms the validity of the dog as a model for the study of molecular aspects of human melanoma.
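The selection rule in the abstract above (Fold Change greater than two, p-value less than 0.05) can be sketched in a few lines of Python. This is only an illustration: the gene names and expression values are hypothetical, and treating a fold change below 1/2 as "down-regulated" is an assumption, since the abstract does not say how the threshold is applied in that direction.

```python
def select_genes(results, fc_threshold=2.0, p_threshold=0.05):
    """Split genes into up- and down-regulated lists.

    `results` maps a gene name to (tumor_expression, normal_expression, p_value).
    Assumption: down-regulation is the mirror image of up-regulation,
    i.e. fold change below 1 / fc_threshold.
    """
    up, down = [], []
    for gene, (tumor, normal, p_value) in results.items():
        # Skip non-significant genes and values that cannot form a ratio.
        if p_value >= p_threshold or tumor <= 0 or normal <= 0:
            continue
        fold_change = tumor / normal
        if fold_change > fc_threshold:
            up.append(gene)
        elif fold_change < 1.0 / fc_threshold:
            down.append(gene)
    return up, down


# Hypothetical values, for illustration only.
example = {
    "GENE_A": (40.0, 10.0, 0.01),   # fold change 4.0, significant
    "GENE_B": (5.0, 20.0, 0.03),    # fold change 0.25, significant
    "GENE_C": (30.0, 10.0, 0.20),   # fold change 3.0, not significant
    "GENE_D": (12.0, 10.0, 0.001),  # fold change 1.2, below threshold
}
up_genes, down_genes = select_genes(example)
```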

Keywords: animal model, canine melanoma, gene expression, spontaneous tumors, targeted RNAseq

Procedia PDF Downloads 193
2202 Melanoma and Non-Melanoma, Skin Lesion Classification, Using a Deep Learning Model

Authors: Shaira L. Kee, Michael Aaron G. Sy, Myles Joshua T. Tan, Hezerul Abdul Karim, Nouar AlDahoul

Abstract:

Skin diseases are considered the fourth most common disease, with melanoma and non-melanoma skin cancer as the most common type of cancer in Caucasians. The alarming increase in Skin Cancer cases shows an urgent need for further research to improve diagnostic methods, as early diagnosis can significantly improve the 5-year survival rate. Machine Learning algorithms for image pattern analysis in diagnosing skin lesions can dramatically increase the accuracy rate of detection and decrease possible human errors. Several studies have shown the diagnostic performance of computer algorithms outperformed dermatologists. However, existing methods still need improvements to reduce diagnostic errors and generate efficient and accurate results. Our paper proposes an ensemble method to classify dermoscopic images into benign and malignant skin lesions. The experiments were conducted using the International Skin Imaging Collaboration (ISIC) image samples. The dataset contains 3,297 dermoscopic images with benign and malignant categories. The results show improvement in performance with an accuracy of 88% and an F1 score of 87%, outperforming other existing models such as support vector machine (SVM), Residual network (ResNet50), EfficientNetB0, EfficientNetB4, and VGG16.
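For readers unfamiliar with the two figures reported above, the following minimal Python sketch shows how accuracy and the F1 score are computed for a binary benign/malignant task. The label vectors are toy values, not data from the paper, and encoding "malignant" as 1 is an assumption.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; `positive` marks the malignant class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels: 1 = malignant, 0 = benign.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_accuracy, toy_f1 = accuracy_and_f1(toy_true, toy_pred)
```

F1 is usually reported alongside accuracy because accuracy alone can look good when one class dominates the dataset.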

Keywords: deep learning, VGG16, EfficientNet, CNN, ensemble, dermoscopic images, melanoma

Procedia PDF Downloads 77
2201 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

Imbalance and overlap in the class distribution are important issues in various data mining applications. An imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., the major class) heavily exceeds the number of observations of the other class (i.e., the minor class). An overlapped dataset is one in which many observations are shared between the two classes. Imbalanced and overlapped data are frequently found in real examples, including fraud and abuse detection in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is a challenging issue because it degrades the performance of most standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting the data space into three parts (nonoverlapping, light overlapping, and severe overlapping) and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experimental study was conducted to examine the properties of the proposed method and to compare it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through an experiment with real data.
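The abstract names the Hausdorff distance as one ingredient for separating the regions but does not say how it is applied. As background only, here is a minimal Python sketch of the standard (symmetric) Hausdorff distance between two finite point sets; the function names and the example points are illustrative, and this is not the authors' procedure.

```python
import math


def directed_hausdorff(set_a, set_b):
    """Largest distance from a point in set_a to its nearest neighbour in set_b."""
    return max(min(math.dist(p, q) for q in set_b) for p in set_a)


def hausdorff(set_a, set_b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(set_a, set_b), directed_hausdorff(set_b, set_a))


# Illustrative 2-D points for a major and a minor class.
major_class = [(0.0, 0.0), (1.0, 0.0)]
minor_class = [(0.0, 0.0), (4.0, 0.0)]
class_distance = hausdorff(major_class, minor_class)
```

A small Hausdorff distance means every point of each class has a nearby point of the other class, which is one way to quantify how strongly two classes overlap.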

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 305
2200 Automated Feature Detection and Matching Algorithms for Breast IR Sequence Images

Authors: Chia-Yen Lee, Hao-Jen Wang, Jhih-Hao Lai

Abstract:

In recent years, infrared (IR) imaging has been considered a potential tool for assessing the efficacy of chemotherapy and for the early detection of breast cancer. Regions of tumor growth, with their high metabolic rate and angiogenesis, show elevated temperatures. Observing differences between heat maps over the long term helps assess the growth of breast cancer cells and detect breast cancer earlier, and aligning infrared images taken at multiple times is a necessary step. Detecting and matching representative feature points are essential for good image registration and quantitative analysis. However, infrared images have no clear boundaries, and the subject's posture differs from shot to shot. Adhesive markers cannot stay on the body surface for a very long period, and anatomic fiducial markers are hard to find on the body surface. In other words, it is difficult to detect and match features in IR sequence images. In this study, automated feature detection and matching algorithms are developed for two types of automatic feature points (i.e., vascular branch points and modified Harris corners). The preliminary results show that the proposed method identifies representative feature points on IR breast images with 98% accuracy and matches them with 93% accuracy.
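The abstract does not describe how the Harris corner detector was modified, so the sketch below shows only the standard Harris response, R = det(M) - k * trace(M)^2, computed in plain Python on a tiny synthetic image. The value k = 0.04, the unweighted 3x3 window, and the test image are assumptions made for illustration.

```python
def harris_response(img, k=0.04, radius=1):
    """Standard Harris corner response for a 2-D list of intensities."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels are left at zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            # Sum gradient products over the window to form the structure tensor M.
            sxx = syy = sxy = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp


# Synthetic 8x8 image: a bright square in the lower-right quadrant,
# so the square's corner sits at row 4, column 4.
square = [[1.0 if (y >= 4 and x >= 4) else 0.0 for x in range(8)] for y in range(8)]
response = harris_response(square)
```

The response is positive at corners, negative along straight edges, and zero in flat regions, which is why thresholding it yields candidate feature points.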

Keywords: Harris corner, infrared image, feature detection, registration, matching

Procedia PDF Downloads 299
2199 Nectariferous Plant Genetic Resources for Apicultural Entrepreneurship in Nigeria: Prerequisite for Conservation, Sustainable Management and Policy

Authors: C. V. Nnamani, O. L. Adedeji

Abstract:

The contemporary global economic meltdown has had a devastating effect on Nigeria’s economy, and the search for alternative sources of national revenue besides oil and gas has become imperative for the economic emancipation of Nigerians. Apicultural entrepreneurship could provide a source of livelihood if basic knowledge of the plant genetic resources needed by bees is made available. A palynological evaluation of the palynotaxa that honey bees forage for pollen and nectar was carried out following the standard acetolysis method. Results showed that the honey samples were highly diversified and rich in honey plants. A total of 9544.3 honey pollen, consisting of 39 honey plants belonging to 21 plant families and distributed within 38 genera, were identified, excluding 238 unidentified pollen grains. Data from the analysis equally revealed that Elaeis guineensis Jacq, Anacardium occidentale L, Diospyros mespiliformis Hochist ex ADC, Alchornea cordifolia Muell. Arg, Daniella oliveri (Rolfe) Hutch & Dalz, Irvingia wombolu Okafor ex Baill, Treculia africana Decne, Nauclea latifolia Smith and Crossopteryx febrifuga Afzil ex Benth were the predominant honey plants. The study provides a guide to the optimal utilization of floral resources by honeybees in these regions, showing the opportunity and potential of these palynotaxa for apiculture entrepreneurship. Most of these plants are rare, threatened and endangered, which calls for urgent conservation techniques and steps by all players. Awareness creation is critically needed so that farmers know these palynotaxa, understand their value, and attend to them as a means of economic empowerment.

Keywords: palynotaxa, acetolysis, enterprise, livelihood, Nigeria

Procedia PDF Downloads 288
2198 A Supervised Approach for Detection of Singleton Spam Reviews

Authors: Atefeh Heydari, Mohammadali Tavakoli, Naomie Salim

Abstract:

In recent years, online reviews have become the most important source of customers’ opinions. They are increasingly used by individuals and organisations to make purchase and business decisions. Unfortunately, for profit or fame, fraudsters produce deceptive reviews to hoodwink potential customers. Their activities not only prevent potential customers from making appropriate purchasing decisions and organisations from reshaping their business, but also prevent opinion mining techniques from reaching accurate results. Spam reviews can be divided into two main groups, i.e. multiple and singleton spam reviews. Detecting a singleton spam review, which is the only review written by a user ID, is extremely challenging because there are so few clues for detection. Singleton spam reviews are very harmful, and various features and proofs used in multiple spam review detection are not applicable in this case. The current research aims to propose a novel supervised technique to detect singleton spam reviews. To achieve this, various features are proposed in this study, combined with the most appropriate features extracted from the literature, and employed in a classifier. In order to compare the performance of different classifiers, SVM and naive Bayes classification algorithms were used for model building. The results revealed that SVM was more accurate than naive Bayes and that the proposed technique is capable of detecting singleton spam reviews effectively.
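To make one of the two compared classifiers concrete, here is a minimal multinomial naive Bayes sketch in plain Python with Laplace smoothing. It classifies on bag-of-words tokens only; the paper's actual feature set is not given in the abstract, and the four training reviews and their labels are invented toy examples.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count documents per class and words per class; return the model tuple."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = doc.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in doc.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration.
toy_reviews = [
    "best product ever buy now",
    "amazing deal buy now",
    "battery lasted two days",
    "screen cracked after a week",
]
toy_labels = ["spam", "spam", "genuine", "genuine"]
nb_model = train_nb(toy_reviews, toy_labels)
```

An SVM would consume the same feature vectors but learn a maximum-margin separating boundary instead of per-class word probabilities.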

Keywords: classification algorithms, Naïve Bayes, opinion review spam detection, singleton review spam detection, support vector machine

Procedia PDF Downloads 304
2197 Commuters Trip Purpose Decision Tree Based Model of Makurdi Metropolis, Nigeria and Strategic Digital City Project

Authors: Emmanuel Okechukwu Nwafor, Folake Olubunmi Akintayo, Denis Alcides Rezende

Abstract:

Decision tree models are versatile and interpretable machine learning algorithms widely used for both classification and regression tasks, which can be related to cities, whether physical or digital. The aim of this research is to assess how well decision tree algorithms can predict trip purposes in Makurdi, Nigeria, while also exploring their connection to the strategic digital city initiative. The research methodology involves formalizing household demographic and trip information datasets obtained from an extensive survey process. Modelling and prediction were carried out in the Python programming language, and evaluation metrics such as R-squared and mean absolute error (MAE) were used to assess the decision tree algorithm's performance. The results indicate that the model performed well, with accuracies of 84% and 68%, and low MAE values of 0.188 and 0.314, on training and validation data, respectively. This suggests the model can be relied upon for future prediction. The model will assist decision-makers, including urban planners, transportation engineers, government officials, and commuters, in making informed decisions on transportation planning and management within the framework of a strategic digital city. Its application will enhance the efficiency, sustainability, and overall quality of transportation services in Makurdi, Nigeria.
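The core step of a classification tree is choosing the split that most reduces class impurity. The following Python sketch shows that step for a single numeric feature using Gini impurity; the abstract does not state which criterion or library the authors used, and the feature values and trip-purpose labels below are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the threshold on one feature that minimises weighted Gini impurity.

    Returns (threshold, impurity); threshold is None if no split improves
    on the unsplit node. Samples with value <= threshold go left.
    """
    best_threshold, best_impurity = None, gini(labels)
    # Every distinct value except the largest is a candidate threshold.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical feature (e.g. trip distance in km) and trip purposes.
trip_distance = [1, 2, 3, 10, 11, 12]
trip_purpose = ["work", "work", "work", "school", "school", "school"]
split = best_split(trip_distance, trip_purpose)
```

A full tree applies this search recursively to each resulting subset and across all features until a stopping rule is met.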

Keywords: decision tree algorithm, trip purpose, intelligent transport, strategic digital city, travel pattern, sustainable transport

Procedia PDF Downloads 11
2196 Optimal Pricing Based on Real Estate Demand Data

Authors: Vanessa Kummer, Maik Meusel

Abstract:

Real estate demand estimates are typically derived from transaction data. However, in regions with excess demand, transactions are driven by supply and therefore do not indicate what people are actually looking for. To estimate the demand for housing in Switzerland, search subscriptions from all important Swiss real estate platforms are used. These data do, however, suffer from missing information—for example, many users do not specify how many rooms they would like or what price they would be willing to pay. In economic analyses, it is often the case that only complete data is used. Usually, however, the proportion of complete data is rather small which leads to most information being neglected. Also, the data might have a strong distortion if it is complete. In addition, the reason that data is missing might itself also contain information, which is however ignored with that approach. An interesting issue is, therefore, if for economic analyses such as the one at hand, there is an added value by using the whole data set with the imputed missing values compared to using the usually small percentage of complete data (baseline). Also, it is interesting to see how different algorithms affect that result. The imputation of the missing data is done using unsupervised learning. Out of the numerous unsupervised learning approaches, the most common ones, such as clustering, principal component analysis, or neural networks techniques are applied. By training the model iteratively on the imputed data and, thereby, including the information of all data into the model, the distortion of the first training set—the complete data—vanishes. In a next step, the performances of the algorithms are measured. This is done by randomly creating missing values in subsets of the data, estimating those values with the relevant algorithms and several parameter combinations, and comparing the estimates to the actual data. 
After having found the optimal parameter set for each algorithm, the missing values are being imputed. Using the resulting data sets, the next step is to estimate the willingness to pay for real estate. This is done by fitting price distributions for real estate properties with certain characteristics, such as the region or the number of rooms. Based on these distributions, survival functions are computed to obtain the functional relationship between characteristics and selling probabilities. Comparing the survival functions shows that estimates which are based on imputed data sets do not differ significantly from each other; however, the demand estimate that is derived from the baseline data does. This indicates that the baseline data set does not include all available information and is therefore not representative for the entire sample. Also, demand estimates derived from the whole data set are much more accurate than the baseline estimation. Thus, in order to obtain optimal results, it is important to make use of all available data, even though it involves additional procedures such as data imputation.
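The evaluation step described above (randomly creating missing values, estimating them, and comparing the estimates with the actual data) can be sketched as follows. Column-mean imputation stands in for the clustering, PCA, and neural network imputers named in the abstract, the mean absolute error is an assumed error measure, and all function names are illustrative.

```python
import random


def mean_impute(column):
    """Replace None entries with the mean of the observed entries."""
    known = [v for v in column if v is not None]
    fill = sum(known) / len(known)
    return [fill if v is None else v for v in column]


def masked_imputation_error(column, impute, frac=0.3, seed=0):
    """Hide a random fraction of a fully observed column, impute it,
    and return the mean absolute error on the hidden entries.

    Assumes `column` has at least two values and no missing entries.
    """
    rng = random.Random(seed)
    # Hide at least one value but always leave at least one observed.
    n_hide = min(len(column) - 1, max(1, int(len(column) * frac)))
    hidden = rng.sample(range(len(column)), n_hide)
    masked = list(column)
    for i in hidden:
        masked[i] = None
    imputed = impute(masked)
    return sum(abs(imputed[i] - column[i]) for i in hidden) / n_hide


# A constant column is imputed perfectly by the mean; real data would not be.
constant_error = masked_imputation_error([5.0] * 10, mean_impute)
```

Running the same function with different imputers and parameter settings gives the kind of comparison the abstract describes, since every candidate is scored on the same hidden entries.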

Keywords: demand estimate, missing-data imputation, real estate, unsupervised learning

Procedia PDF Downloads 282
2195 Detection of MspI Polymorphism and SNP of GH Gene in Some Camel Breeds Reared in Egypt

Authors: Sekena H. Abd El-Aziem, Heba A. M. Abd El-Kader, Sally S. Alam, Othman E. Othman

Abstract:

Growth hormone (GH) is an anabolic hormone synthesized and secreted by the somatotroph cells of the anterior lobe of the pituitary gland in a circadian and pulsatile manner, the pattern of which plays an important role in postnatal longitudinal growth and development, tissue growth, lactation, reproduction as well as protein, lipid and carbohydrate metabolism. The aim of this study was to detect the genetic polymorphism of the GH gene in five camel breeds reared in Egypt (Sudany, Somali, Mowaled, Maghrabi and Falahy) using the PCR-RFLP technique. This work also aimed to identify the single nucleotide polymorphism (SNP) between the different genotypes detected in these camel breeds. The amplified 613-bp fragment of camel GH was digested with the restriction enzyme MspI. The result revealed the presence of three different genotypes (CC, CT and TT) in the tested breeds, and significant differences in genotype frequencies were recorded between these camel breeds. The Maghrabi breed, which is classified as a dual-purpose camel, had a higher frequency of allele C (0.75) than the other four tested breeds. Sequence analysis revealed a SNP (C→T) at position 264 in the amplified fragment, which is responsible for the destruction of the restriction site C^CGG and consequently the appearance of two different alleles, C and T. The nucleotide sequences of camel GH alleles T and C were submitted to the nucleotide sequence database NCBI/Bankit/GenBank and have the accession numbers KP143517 and KP143518, respectively. It is concluded that only one SNP (C→T) was detected in the GH gene among the five tested camel breeds reared in Egypt and that this nucleotide substitution can be used as a marker for the genetic biodiversity between camel breeds reared in Egypt. Also, due to the possible association between allele C and higher growth rate, this allele can be used in marker-assisted selection (MAS) for camels, and camels carrying it can be entered into breeding programs to enhance growth traits in camel breeds reared in Egypt.
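Two calculations underlie the PCR-RFLP results above: deriving allele frequencies from genotype counts, and checking whether a sequence still contains the MspI recognition site CCGG (cut as C^CGG). The Python sketch below illustrates both; the genotype counts and the short sequences are hypothetical and are not taken from the study or from the GenBank entries.

```python
def allele_frequencies(n_cc, n_ct, n_tt):
    """Return (freq_C, freq_T) from counts of CC, CT and TT animals."""
    total_alleles = 2 * (n_cc + n_ct + n_tt)
    if total_alleles == 0:
        raise ValueError("no genotyped animals")
    # Each CC animal carries two C alleles, each CT animal carries one.
    freq_c = (2 * n_cc + n_ct) / total_alleles
    return freq_c, 1.0 - freq_c


def mspi_cut_positions(sequence):
    """Return the number of bases to the left of each MspI cut (C^CGG)."""
    site = "CCGG"
    seq = sequence.upper()
    return [i + 1 for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site]


# Hypothetical genotype counts for one breed.
freq_c, freq_t = allele_frequencies(n_cc=5, n_ct=4, n_tt=1)

# Hypothetical fragments: a C->T substitution inside CCGG removes the site.
allele_c_cuts = mspi_cut_positions("AACCGGTT")
allele_t_cuts = mspi_cut_positions("AACTGGTT")
```

On a gel, the C allele is cut into two fragments while the T allele stays intact, which is how the CC, CT and TT genotypes are distinguished.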

Keywords: camel breeds in Egypt, GH, PCR-RFLP, SNPs

Procedia PDF Downloads 459
2194 Fuzzy Data, Random Drift, and a Theoretical Model for the Sequential Emergence of Religious Capacity in Genus Homo

Authors: Margaret Boone Rappaport, Christopher J. Corbally

Abstract:

The ancient ape ancestral population from which living great ape and human species evolved had demographic features affecting their evolution. The population was large, had great genetic variability, and natural selection was effective at honing adaptations. The emerging populations of chimpanzees and humans were affected more by founder effects and genetic drift because they were smaller. Natural selection did not disappear, but it was not as strong. Consequences of the 'population crash' and the human effective population size are introduced briefly. The history of the ancient apes is written in the genomes of living humans and great apes. The expansion of the brain began before the human line emerged. Coalescence times for some genes are very old – up to several million years, long before Homo sapiens. The mismatch between gene trees and species trees highlights the anthropoid speciation processes, and gives the human genome history a fuzzy, probabilistic quality. However, it suggests traits that might form a foundation for capacities emerging later. A theoretical model is presented in which the genomes of early ape populations provide the substructure for the emergence of religious capacity later on the human line. The model does not search for religion, but its foundations. It suggests a course by which an evolutionary line that began with prosimians eventually produced a human species with biologically based religious capacity. The model of the sequential emergence of religious capacity relies on cognitive science, neuroscience, paleoneurology, primate field studies, cognitive archaeology, genomics, and population genetics. And, it emphasizes five trait types: (1) Documented, positive selection of sensory capabilities on the human line may have favored survival, but also eventually enriched human religious experience. 
(2) The bonobo model suggests a possible down-regulation of aggression and increase in tolerance while feeding, as well as paedomorphism – but, in a human species that remains cognitively sharp (unlike the bonobo). The two species emerged from the same ancient ape population, so it is logical to search for shared traits. (3) An up-regulation of emotional sensitivity and compassion seems to have occurred on the human line. This finds support in modern genetic studies. (4) The authors’ published model of morality's emergence in Homo erectus encompasses a cognitively based, decision-making capacity that was hypothetically overtaken, in part, by religious capacity. Together, they produced a strong, variable, biocultural capability to support human sociability. (5) The full flowering of human religious capacity came with the parietal expansion and smaller face (klinorhynchy) found only in Homo sapiens. Details from paleoneurology suggest the stage was set for human theologies. Larger parietal lobes allowed humans to imagine inner spaces, processes, and beings, and, with the frontal lobe, led to the first theologies composed of structured and integrated theories of the relationships between humans and the supernatural. The model leads to the evolution of a small population of African hominins that was ready to emerge with religious capacity when the species Homo sapiens evolved two hundred thousand years ago. By 50-60,000 years ago, when human ancestors left Africa, they were fully enabled.

Keywords: genetic drift, genomics, parietal expansion, religious capacity

Procedia PDF Downloads 335
2193 The Classification Accuracy of Finance Data through Holder Functions

Authors: Yeliz Karaca, Carlo Cattani

Abstract:

This study focuses on the local Holder exponent as a measure of function regularity for time series related to finance data. The attributes of a finance dataset belonging to 13 countries (India, China, Japan, Sweden, France, Germany, Italy, Australia, Mexico, United Kingdom, Argentina, Brazil, USA) located in 5 different continents (Asia, Europe, Australia, North America and South America) have been examined. These countries are the ones most affected by the attributes with regard to financial development over the period from 2012 to 2017. Our study is concerned with the most important attributes that have an impact on the development of finance for the countries identified. Our method comprises the following stages: (a) significant and self-similar attributes have been identified using multifractal methods and Brownian motion Holder regularity functions (polynomial, exponential); (b) the significant and self-similar attributes have been applied to Artificial Neural Network (ANN) algorithms (Feed Forward Back Propagation (FFBP) and Cascade Forward Back Propagation (CFBP)); (c) the classification accuracies have been compared for the attributes that affect the countries’ financial development. Through the application of ANN algorithms, this study reveals how the most significant attributes are identified within the relevant dataset via the Holder functions (polynomial and exponential).

Keywords: artificial neural networks, finance data, Holder regularity, multifractals

Procedia PDF Downloads 242
2192 Nondestructive Prediction and Classification of Gel Strength in Ethanol-Treated Kudzu Starch Gels Using Near-Infrared Spectroscopy

Authors: John-Nelson Ekumah, Selorm Yao-Say Solomon Adade, Mingming Zhong, Yufan Sun, Qiufang Liang, Muhammad Safiullah Virk, Xorlali Nunekpeku, Nana Adwoa Nkuma Johnson, Bridget Ama Kwadzokpui, Xiaofeng Ren

Abstract:

Enhancing starch gel strength and stability is crucial. However, traditional gel property assessment methods are destructive, time-consuming, and resource-intensive. Thus, understanding ethanol treatment effects on kudzu starch gel strength and developing a rapid, nondestructive gel strength assessment method is essential for optimizing the treatment process and ensuring product quality consistency. This study investigated the effects of different ethanol concentrations on the microstructure of kudzu starch gels using a comprehensive microstructural analysis. We also developed a nondestructive method for predicting gel strength and classifying treatment levels using near-infrared (NIR) spectroscopy, and advanced data analytics. Scanning electron microscopy revealed progressive network densification and pore collapse with increasing ethanol concentration, correlating with enhanced mechanical properties. NIR spectroscopy, combined with various variable selection methods (CARS, GA, and UVE) and modeling algorithms (PLS, SVM, and ELM), was employed to develop predictive models for gel strength. The UVE-SVM model demonstrated exceptional performance, with the highest R² values (Rc = 0.9786, Rp = 0.9688) and lowest error rates (RMSEC = 6.1340, RMSEP = 6.0283). Pattern recognition algorithms (PCA, LDA, and KNN) successfully classified gels based on ethanol treatment levels, achieving near-perfect accuracy. This integrated approach provided a multiscale perspective on ethanol-induced starch gel modification, from molecular interactions to macroscopic properties. Our findings demonstrate the potential of NIR spectroscopy, coupled with advanced data analysis, as a powerful tool for rapid, nondestructive quality assessment in starch gel production. This study contributes significantly to the understanding of starch modification processes and opens new avenues for research and industrial applications in food science, pharmaceuticals, and biomaterials.
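The figures quoted above are a root-mean-square error and a correlation-type coefficient, each reported once for the calibration set (RMSEC, Rc) and once for the prediction set (RMSEP, Rp). The Python sketch below shows how such values are commonly computed; treating Rc and Rp as Pearson correlation coefficients is an assumption, since the abstract does not define them, and the numbers are toy values.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy gel-strength values: predictions are offset by exactly one unit.
toy_measured = [1.0, 2.0, 3.0]
toy_predicted = [2.0, 3.0, 4.0]
toy_rmse = rmse(toy_measured, toy_predicted)
toy_r = pearson_r(toy_measured, toy_predicted)
```

The toy case shows why both numbers are reported: a constant offset leaves the correlation at 1 while the RMSE exposes the bias.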

Keywords: kudzu starch gel, near-infrared spectroscopy, gel strength prediction, support vector machine, pattern recognition algorithms, ethanol treatment

Procedia PDF Downloads 30
2191 General Architecture for Automation of Machine Learning Practices

Authors: U. Borasi, Amit Kr. Jain, Rakesh, Piyush Jain

Abstract:

Data collection, data preparation, model training, model evaluation, and deployment are all processes in a typical machine learning workflow. Training data needs to be gathered and organised. This often entails collecting a sizable dataset and cleaning it to remove or correct any inaccurate or missing information. Preparing the data for use in the machine learning model requires pre-processing it after it has been acquired. This often entails actions like scaling or normalising the data, handling outliers, selecting appropriate features, reducing dimensionality, etc. This pre-processed data is then used to train a model on some machine learning algorithm. After the model has been trained, it needs to be assessed by determining metrics like accuracy, precision, and recall, utilising a test dataset. Every time a new model is built, both data pre-processing and model training—two crucial processes in the Machine learning (ML) workflow—must be carried out. Thus, there are various Machine Learning algorithms that can be employed for every single approach to data pre-processing, generating a large set of combinations to choose from. Example: for every method to handle missing values (dropping records, replacing with mean, etc.), for every scaling technique, and for every combination of features selected, a different algorithm can be used. As a result, in order to get the optimum outcomes, these tasks are frequently repeated in different combinations. This paper suggests a simple architecture for organizing this largely produced “combination set of pre-processing steps and algorithms” into an automated workflow which simplifies the task of carrying out all possibilities.
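The "combination set" described above can be illustrated with a small sketch. It assumes one choice is made per pre-processing step and per algorithm; the option names below are hypothetical placeholders, since the abstract does not list concrete operators, and this is not the authors' architecture.

```python
from itertools import product

# Hypothetical option names; the abstract does not list concrete operators.
missing_value_strategies = ["drop_records", "replace_with_mean"]
scaling_techniques = ["none", "min_max", "standardise"]
algorithms = ["logistic_regression", "decision_tree"]


def build_combination_set(*option_groups):
    """Return every configuration as a tuple holding one choice per group."""
    return list(product(*option_groups))


combinations = build_combination_set(
    missing_value_strategies, scaling_techniques, algorithms
)

# Show a few configurations and the total size of the search space.
for config in combinations[:3]:
    print(config)
print(len(combinations), "configurations in total")
```

Each tuple could then be handed to a scheduler that runs the matching pre-processing steps and training job; the size of the set is the product of the group sizes, which is why automating the sweep matters.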

Keywords: machine learning, automation, AUTOML, architecture, operator pool, configuration, scheduler

Procedia PDF Downloads 52
2190 Rank-Based Chain-Mode Ensemble for Binary Classification

Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu

Abstract:

In machine learning, ensembles are commonly employed to improve on the performance of multiple base classifiers. However, true predictions are often canceled out by false ones during consensus due to a phenomenon called the “curse of correlation”, which manifests as strong interference among the predictions produced by the base classifiers. In addition, existing practices are still unable to effectively mitigate the problem of imbalanced classification. Based on an analysis of our experimental results, we conclude that both problems are caused by inherent deficiencies in the consensus approach. Therefore, we create an enhanced ensemble algorithm that adopts a purpose-designed rank-based chain-mode consensus to overcome the two problems. To evaluate the proposed ensemble algorithm, we employ the well-known benchmark data set NSL-KDD (the improved version of the KDDCup99 dataset, produced by the University of New Brunswick) to compare the proposed algorithm with 8 common ensemble algorithms. In particular, each compared ensemble classifier uses the same 22 base classifiers, so that differences in the improvement in accuracy and reliability over the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus proves to be a more effective ensemble solution than the traditional consensus approach, outperforming the 8 ensemble algorithms by 20% on almost all compared metrics, which include accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve.

Keywords: consensus, curse of correlation, imbalance classification, rank-based chain-mode ensemble

Procedia PDF Downloads 133
2189 The Influence of the Aquatic Environment on Hematological Parameters in Cyprinus carpio

Authors: Andreea D. Șerban, Răzvan Mălăncuș, Mihaela Ivancia, Șteofil Creangă

Abstract:

Just as air influences the quality of life in the terrestrial environment, water, as a living environment, is one of great importance when it comes to the quality of life of underwater animals, which acquires an even higher degree of importance when analyzing underwater creatures as future products for human consumption. Thus, going beyond the ideal environment, in which all water quality parameters are permanently in perfect standards for reproduction, growth, and development of fish material and customizing this study to reality, it was demonstrated the importance of reproduction, development, and growth of biological material, necessary in the population fish farms, in the same environment to gain the maximum yield that a fish farm can offer. The biological material used was harvested from 3 fish farms located at great distances from each other to have environments with different parameters. The specimens were clinically healthy at 2 years of age. Thus, the differences in water quality parameters had effects on specimens from other environments, describing large curves in their evolution in new environments. Another change was observed in the new environment, the specimens contributing with the "genetic package" to its modification, tending to a balance of the parameters studied to the values in the environment in which they lived until the time of the experiment. The study clearly showed that adaptability to the environment in which an individual has developed and grown is not valid in environments with different parameters, resulting even in the fatality of one sample during the experiment. In some specimens, the values of the studied hematological parameters were halved after the transfer to the new environment, and in others, the same parameters were doubled. The study concludes that the specimens were adapted to the environment in which they developed and grew, their descendants having a higher value of heritability only in the initial environment. 
It is known that heritability is influenced 50% by the genetic package of the individual and 50% by the environment. By removing the influence of the environment, the time needed to improve the traits of interest will be shorter, and the maximum yield of fish farms can be achieved in a shorter period.

Keywords: environment, heritability, quality, water

Procedia PDF Downloads 165
2188 Unlocking Health Insights: Studying Data for Better Care

Authors: Valentina Marutyan

Abstract:

Healthcare data mining is a rapidly developing field at the intersection of technology and medicine that has the potential to change our understanding of and approach to providing healthcare. Healthcare data mining is the process of examining huge amounts of data to extract useful information that can be applied to improve patient care, treatment effectiveness, and overall healthcare delivery. Using advanced analytical approaches, this field looks for patterns, trends, and correlations in a variety of healthcare datasets, such as electronic health records (EHRs), medical imaging, patient demographics, and treatment histories. Predictive analysis using historical patient data is a major area of interest in healthcare data mining. This enables doctors to intervene early to prevent problems or improve results for patients. It also assists in early disease detection and customized treatment planning for every person. Doctors can customize a patient's care by looking at their medical history, genetic profile, and current and previous therapies. In this way, treatments can be more effective and have fewer negative consequences. Moreover, besides helping patients, it improves the efficiency of hospitals by helping them determine the number of beds or doctors they require relative to the number of patients they expect. In this project, models such as logistic regression, random forests, and neural networks were used for predicting diseases and analyzing medical images. Clustering algorithms such as k-means were used to group patients, and connections between treatments and patient responses were identified by association rule mining. Time series techniques helped in resource management by predicting patient admissions. These methods improved healthcare decision-making and personalized treatment. 
Healthcare data mining must also deal with difficulties such as poor data quality, privacy challenges, managing large and complicated datasets, ensuring the reliability of models, managing biases, limited data sharing, and regulatory compliance. Finally, data mining in healthcare helps medical professionals and hospitals make better decisions, treat patients more effectively, and work more efficiently. It ultimately comes down to using data to improve treatment, make better choices, and simplify hospital operations for all patients.

Keywords: data mining, healthcare, big data, large amounts of data

Procedia PDF Downloads 72
2187 Gene Distribution of CB1 Receptor rs2023239 in Thailand Cannabis Patients

Authors: Tanyaporn Chairoch

Abstract:

Introduction: Cannabis is used to treat patients with many diseases, such as multiple sclerosis, Alzheimer’s disease, and epilepsy, and it contains many active compounds, such as delta-9 tetrahydrocannabinol (THC) and cannabidiol (CBD). In particular, THC is the primary psychoactive ingredient in cannabis and binds to cannabinoid 1 (CB1) receptors. CB1 is located in the neocortex, hippocampus, basal ganglia, cerebellum, and brainstem. In a previous study, we found an association between a variant of the CB1 receptor gene (rs2023239) and a decreased effect of nicotine reinforcement in patients. However, there are no data describing whether the distribution of the CB1 receptor gene is a genetic marker for Thai patients who are treated with cannabis. Objective: The aim of this study was to investigate the frequency of the CB1 receptor gene in Thai patients. Materials and Methods: Sixty Thai patients who received medical cannabis for treatment were recruited in this study. DNA was extracted from EDTA whole blood using the Genomic DNA Mini Kit. The CNR1 gene (rs2023239) was genotyped by the TaqMan real-time PCR assay (ABI, Foster City, CA, USA) using the ViiA7 real-time PCR system (ABI, Foster City, CA, USA). Results: Thirty-eight (63.3%) of the Thai patients were female and twenty-two (36.70%) were male, with a median age of 45.8 (range 19 to 87) years. Among these, thirty-two (53.30%) medical cannabis tolerant controls were female (55%), with a median age of 52.1 (range 27 to 79) years. The most common adverse effect of medical cannabis treatment was tachycardia. Furthermore, the number of rs2023239 (TT) carriers was 26 of 27 (96.29%) among patients with medical cannabis-induced adverse effects and 32 of 33 (96.96%) among tolerant controls. Additionally, the rs2023239 (CT) variant was found in only 1 of 27 (3.7%) patients with medical cannabis-induced adverse effects and 1 of 33 (3.03%) tolerant controls. 
Conclusions: The distribution of the genetic variant in the CNR1 gene might serve as a pharmacogenetic marker for screening before initiating therapy with medical cannabis in Thai patients.

Keywords: cannabis, pharmacogenetics, CNR1 gene, thai patient

Procedia PDF Downloads 105
2186 Oil-Oil Correlation Using Polar and Non-Polar Fractions of Crude Oil: A Case Study in Iranian Oil Fields

Authors: Morteza Taherinezhad, Ahmad Reza Rabbani, Morteza Asemani, Rudy Swennen

Abstract:

Oil-oil correlation is one of the most important issues in geochemical studies, as it enables oils to be classified genetically. Oil-oil correlation is generally estimated based on non-polar fractions of crude oil (e.g., saturate and aromatic compounds). Despite several advantages, the drawback of using these compounds is their susceptibility to secondary processes. The polar fraction of crude oil (e.g., asphaltenes) has similar characteristics to kerogen, and this structural similarity is preserved during migration, thermal maturation, biodegradation, and water washing. Therefore, these structural characteristics can be considered a useful correlation parameter, and it can be concluded that asphaltenes from different reservoirs with the same genetic signatures have a similar origin. Hence, in this contribution, an integrated study using both non-polar and polar fractions of oil was performed to exploit the merits of both fractions. Five oil samples from oil fields in the Persian Gulf were studied. Structural characteristics of the extracted asphaltenes were investigated by Fourier transform infrared (FTIR) spectroscopy. Graphs based on aliphatic and aromatic compounds (the predominant compounds in the asphaltene structure) and sulphoxide and carbonyl functional groups (which represent sulphur and oxygen abundance in asphaltenes) were used to compare asphaltene structures in the different samples. Non-polar fractions were analyzed by GC-MS. The study of asphaltenes showed that the studied oil samples comprise two oil families with distinct genetic characteristics. The first oil family consists of the Salman and Reshadat oil samples, and the second oil family consists of the Resalat, Siri E, and Siri D oil samples. To validate our results, biomarker parameters were employed, and this approach completely confirmed the previous results. 
Based on biomarker analyses, both oil families have a marine source rock, whereby marl and carbonate source rocks are the source rock for the first and the second oil family, respectively.

Keywords: biomarker, non-polar fraction, oil-oil correlation, petroleum geochemistry, polar fraction

Procedia PDF Downloads 130
2185 Bias Prevention in Automated Diagnosis of Melanoma: Augmentation of a Convolutional Neural Network Classifier

Authors: Kemka Ihemelandu, Chukwuemeka Ihemelandu

Abstract:

Melanoma remains a public health crisis, with incidence rates increasing rapidly in the past decades. The use of artificial intelligence (AI) to improve diagnostic accuracy and decrease misdiagnosis continues to be documented. Unfortunately, unintended racially biased outcomes, a product of a lack of diversity in the dataset used, with a noted class imbalance favoring lighter over darker skin tones, have increasingly been recognized as a problem, resulting in noted limitations in the accuracy of convolutional neural network (CNN) models. CNN models are prone to biased output due to biases in the dataset used to train them. Our aim in this study was to optimize convolutional neural network algorithms to mitigate bias in the automated diagnosis of melanoma. We hypothesized that our proposed training algorithms, based on a data augmentation method that optimizes the diagnostic accuracy of a CNN classifier by generating new training samples from the original ones, would reduce bias in the automated diagnosis of melanoma. We applied geometric transformations, including rotations, translations, scale changes, flipping, and shearing. This resulted in a CNN model trained on modified input data, making for a model that could learn subtle racial features. Optimal selection of the momentum and batch hyperparameters increased our model accuracy. We show that our augmented model reduces bias while maintaining accuracy in the automated diagnosis of melanoma.
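Geometric augmentation of the kind listed above can be sketched on a toy image. This is only an illustration under stated assumptions: the image is a small grayscale grid held as nested lists, the function names are hypothetical, and only flipping, a 90-degree rotation, and a one-pixel translation are shown; the authors' actual pipeline and parameters are not given in the abstract.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def translate_right(image, shift, fill=0):
    """Shift pixels right by `shift` columns, padding the left with `fill`."""
    width = len(image[0])
    shift = min(shift, width)
    return [[fill] * shift + row[: width - shift] for row in image]


def augment(image):
    """Return the original image plus three transformed copies."""
    return [
        image,
        flip_horizontal(image),
        rotate_90_clockwise(image),
        translate_right(image, 1),
    ]


# Toy 2x2 "image"; real inputs would be full-size dermoscopic images.
sample = [[1, 2], [3, 4]]
for variant in augment(sample):
    print(variant)
```

Each original sample yields several training samples, which is how augmentation enlarges under-represented classes without collecting new images.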

Keywords: bias, augmentation, melanoma, convolutional neural network

Procedia PDF Downloads 205
2184 Genetic Diversity of Termite (Isoptera) Fauna of Western Ghats of India

Authors: A. S. Vidyashree, C. M. Kalleshwaraswamy, R. Asokan, H. M. Mahadevaswamy

Abstract:

Termites are vital ecological actors in tropical ecosystems and have been designated “ecosystem engineers” due to their significant role in providing soil ecosystem services. Despite their importance, our understanding of a number of their basic biological processes is extremely limited. Developing a better understanding of termite biology depends closely on consistent species identification. At present, the identification of termites relies on the soldier caste. For many species, however, the soldier caste has not been reported, which creates confusion in identification. The use of molecular markers may be helpful in estimating phylogenetic relatedness between termite species and in estimating genetic differentiation among local populations within each species. To investigate this, termite samples were collected from various places in the Western Ghats covering four states, namely Karnataka, Kerala, Tamil Nadu, and Maharashtra, during 2013-15. Termite samples were identified based on their morphological characteristics, molecular characteristics, or both. A survey of the termite fauna in Karnataka, Kerala, Maharashtra, and Tamil Nadu indicated the presence of 16 species belonging to 4 subfamilies under two families, viz., Rhinotermitidae and Termitidae. Termitidae was the dominant family, represented by 4 genera and four subfamilies, viz., Macrotermitinae, Amitermitinae, Nasutitermitinae, and Termitinae. Amitermitinae had three species, namely Microcerotermes fletcheri, M. pakistanicus, and Speculitermes sinhalensis. Macrotermitinae had the highest number of species, belonging to two genera, namely Microtermes and Odontotermes. The genus Microtermes had only one species, i.e., Microtermes obesi. The genus Odontotermes was represented by the highest number of species (07): O. obesus was the dominant (41 per cent) and most widely distributed species in Karnataka, Kerala, Maharashtra, and Tamil Nadu, followed by O. feae (19 per cent), O. assmuthi (11 per cent), and others such as O. bellahunisensis, O. horni, O. redemanni, and O. yadevi. Nasutitermitinae was represented by two genera, with the species Nasutitermes anamalaiensis and Trinervitermes biformis. The subfamily Termitinae was represented by Labiocapritermes distortus. Rhinotermitidae was represented by a single subfamily, Heterotermitinae, in which two species, namely Heterotermes balwanthi and H. malabaricus, were recorded. The genetic relationship among termites collected from various locations in the Western Ghats of India was characterized based on mitochondrial DNA sequences (12S, 16S, and COII). Sequence analysis was performed and divergence among the species was assessed. These results suggest that the use of both molecular and morphological approaches is crucial in ensuring accurate species identification. Efforts were made to understand their evolution and to address the ambiguities in morphological taxonomy. The implications of the study for revising the taxonomy of Indian termites, their characterization, and molecular comparisons between the sequences are discussed.

Keywords: isoptera, mitochondrial DNA sequences, rhinotermitidae, termitidae, Western ghats

Procedia PDF Downloads 261
2183 An Adiabatic Quantum Optimization Approach for the Mixed Integer Nonlinear Programming Problem

Authors: Maxwell Henderson, Tristan Cook, Justin Chan Jin Le, Mark Hodson, YoungJung Chang, John Novak, Daniel Padilha, Nishan Kulatilaka, Ansu Bagchi, Sanjoy Ray, John Kelly

Abstract:

We present a method of using adiabatic quantum optimization (AQO) to solve a mixed integer nonlinear programming (MINLP) problem instance. The MINLP problem is a general form of a set of NP-hard optimization problems that are critical to many business applications. It requires optimizing a set of discrete and continuous variables with nonlinear and potentially nonconvex constraints. Obtaining an exact, optimal solution for MINLP problem instances of non-trivial size using classical computation methods is currently intractable. Current leading algorithms leverage heuristic and divide-and-conquer methods to determine approximate solutions. Creating more accurate and efficient algorithms is an active area of research. Quantum computing (QC) has several theoretical benefits compared to classical computing, through which QC algorithms could obtain MINLP solutions that are superior to current algorithms. AQO is a particular form of QC that could offer more near-term benefits compared to other forms of QC, as hardware development is in a more mature state and devices are currently commercially available from D-Wave Systems Inc. It is also designed for optimization problems: it uses an effect called quantum tunneling to explore all lowest points of an energy landscape where classical approaches could become stuck in local minima. Our work used a novel algorithm formulated for AQO to solve a special type of MINLP problem. The research focused on determining: 1) if the problem is possible to solve using AQO, 2) if it can be solved by current hardware, 3) what the currently achievable performance is, 4) what the performance will be on projected future hardware, and 5) when AQO is likely to provide a benefit over classical computing methods. Two different methods, integer range and 1-hot encoding, were investigated for transforming the MINLP problem instance constraints into a mathematical structure that can be embedded directly onto the current D-Wave architecture. 
For testing and validation a D-Wave 2X device was used, as well as QxBranch’s QxLib software library, which includes a QC simulator based on simulated annealing. Our results indicate that it is mathematically possible to formulate the MINLP problem for AQO, but that currently available hardware is unable to solve problems of useful size. Classical general-purpose simulated annealing is currently able to solve larger problem sizes, but does not scale well and such methods would likely be outperformed in the future by improved AQO hardware with higher qubit connectivity and lower temperatures. If larger AQO devices are able to show improvements that trend in this direction, commercially viable solutions to the MINLP for particular applications could be implemented on hardware projected to be available in 5-10 years. Continued investigation into optimal AQO hardware architectures and novel methods for embedding MINLP problem constraints on to those architectures is needed to realize those commercial benefits.
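The 1-hot encoding mentioned above can be sketched for a single integer variable. This is a minimal illustration, not the authors' formulation: it assumes the variable takes one of K values with a known cost each, encodes it with K binary variables, and adds a quadratic penalty that vanishes only when exactly one bit is set. The names `one_hot_qubo`, `energy`, and `brute_force_minimum` are hypothetical, and exhaustive search stands in for the annealer.

```python
from itertools import product


def one_hot_qubo(costs, penalty):
    """Build QUBO weights for: minimise costs[k] subject to sum(bits) == 1.

    penalty * (sum(b) - 1)^2 expands, using b*b == b for binary b, to
    -penalty on each diagonal term, +2*penalty on each pair, plus a
    constant offset equal to penalty.
    """
    n = len(costs)
    qubo = {}
    for k in range(n):
        qubo[(k, k)] = costs[k] - penalty
        for j in range(k):
            qubo[(j, k)] = 2.0 * penalty
    return qubo, penalty


def energy(bits, qubo, offset):
    """Evaluate the QUBO objective for one assignment of the bits."""
    return offset + sum(w * bits[i] * bits[j] for (i, j), w in qubo.items())


def brute_force_minimum(n, qubo, offset):
    """Exhaustively search all 2^n assignments (stand-in for an annealer)."""
    return min(
        product((0, 1), repeat=n),
        key=lambda bits: energy(bits, qubo, offset),
    )


costs = [3.0, 1.0, 2.0]  # hypothetical cost of choosing value 0, 1, or 2
qubo, offset = one_hot_qubo(costs, penalty=10.0)
best = brute_force_minimum(len(costs), qubo, offset)
print(best, energy(best, qubo, offset))
```

The penalty weight must exceed the spread of the costs so that violating the one-hot constraint is never cheaper than satisfying it; note also that each K-valued variable consumes K qubits, which is one reason problem size is limited on small devices.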

Keywords: adiabatic quantum optimization, mixed integer nonlinear programming, quantum computing, NP-hard

Procedia PDF Downloads 523
2182 Decision Support: How Explainable A.I. Can Improve Transparency and Trust with Human Users

Authors: Devon Brown, Liu Chunmei

Abstract:

This paper presents an analysis, as part of the researcher’s dissertation topic, focusing on the intersection of affective and analytical directed acyclic graphs (DAGs) in the context of Decision Support Systems (DSS). The researcher’s work involves analyzing decision theory models, such as Affective and Bayesian decision theory models, and how they could be implemented under an Affective Computing Framework using Information Fusion and Human-Centered Design. Additionally, the researcher is beginning research on an Affective-Analytic Decision Framework (AADF) model for their dissertation and is looking to merge logic and analytic models with empathetic insights into affective DAGs. Data-collection efforts begin in Fall 2024; in preparation, this paper analyzes previous research in this area, introduces the AADF framework, and proposes conceptual models for consideration. For this paper, the research emphasis is placed on analyzing Bayesian networks and Markov models, which offer probabilistic techniques for decision-making under uncertainty. Ideally, including affect in analytic models will allow algorithms to increase user trust by accounting for emotional states and the user’s experience, with the goal of developing emotionally intelligent A.I. systems that can start to navigate the complex fabric of human emotion during decision-making.

Keywords: decision support systems, explainable AI, HCAI techniques, affective-analytical decision framework

Procedia PDF Downloads 15
2181 Genetic Polymorphism of Milk Protein Gene and Association with Milk Production Traits in Local Latvian Brown Breed Cows

Authors: Daina Jonkus, Solvita Petrovska, Dace Smiltina, Lasma Cielava

Abstract:

Beta-lactoglobulin and kappa-casein are milk proteins that are important for milk composition. Cows with BB genotypes of the beta-lactoglobulin and kappa-casein genes have the highest milk crude protein and fat content. The aim of the study was to determine the frequencies of milk protein gene polymorphisms in the local Latvian Brown (LB) cow breed and to analyze the influence of beta-lactoglobulin and kappa-casein genotypes on milk productivity traits. The milk protein gene genotypes of 102 cows were detected using Polymerase Chain Reaction and Restriction Fragment Length Polymorphism (PCR-RFLP) and electrophoresis on 3% agarose gel. Two alleles, A and B, were observed for beta-lactoglobulin, and three, A, B, and E, for kappa-casein. In the beta-lactoglobulin gene, the highest frequency was observed for the B allele (0.926). Molecular analysis of the beta-lactoglobulin gene shows that 86.3% of individuals are homozygous for the B allele (genotype BB) and 12.7% of individuals are heterozygous (genotype AB). The highest milk yield, 4711.7 kg, was for 1st lactation cows with the AB genotype, whereas the highest milk protein content (3.35%) and fat content (4.46%) were for the BB genotype. Analysis of the kappa-casein locus showed a prevalence of the A allele (0.750). The B variant was characterized by a low frequency (0.240), and the E allele occurred in the LB cow population with a very low frequency (0.010). 54.9% of cows are homozygous with genotype AA, and only 4.9% are homozygous with genotype BB. 32.8% of individuals are heterozygous with genotype AB, and 2.0% have genotype AE. The highest milk productivity was for 1st lactation cows with the AB genotype: milk yield 4620.3 kg, milk protein content 3.39%, and fat content 4.53%. According to the results, only 2.9% of local Latvian Brown cows have the BB-BB genotype combination, which is related to milk coagulation ability and affects cheese production yield. 
Acknowledgment: the investigation is supported by VPP 2014-2017 AgroBioRes Project No. 3 LIVESTOCK.
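The relation between the genotype and allele frequencies reported for beta-lactoglobulin can be checked with a short calculation. The sketch below assumes a diploid locus in which each genotype contributes half of its share to each of its two alleles; the AA share of 1.0% is an assumption (the remainder after BB and AB), as the abstract does not state it.

```python
def allele_frequencies(genotype_freqs):
    """Derive allele frequencies from genotype frequencies at a diploid locus.

    Each genotype is a two-letter string such as "AB"; every allele in it
    receives half of the genotype's share.
    """
    freqs = {}
    for genotype, share in genotype_freqs.items():
        for allele in genotype:
            freqs[allele] = freqs.get(allele, 0.0) + share / 2
    return freqs


# BB and AB shares are taken from the abstract; AA = 0.010 is assumed
# here as the remaining fraction and is not reported in the text.
beta_lactoglobulin = {"BB": 0.863, "AB": 0.127, "AA": 0.010}

freqs = allele_frequencies(beta_lactoglobulin)
print(freqs)
```

For the B allele this gives 0.863 + 0.127 / 2 = 0.9265, in line with the reported frequency of 0.926.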

Keywords: beta-lactoglobulin, cows, genotype frequencies, kappa-casein

Procedia PDF Downloads 268
2180 Genome-Wide Homozygosity Analysis of the Longevous Phenotype in the Amish Population

Authors: Sandra Smieszek, Jonathan Haines

Abstract:

Introduction: Numerous research efforts have focused on searching for ‘longevity genes’. However, attempts to decipher the genetic component of the longevous phenotype have resulted in limited success, and the mechanisms governing longevity remain to be explained. We conducted a genome-wide homozygosity analysis (GWHA) of the founder population of the Amish community in central Ohio. While genome-wide association studies using unrelated individuals have revealed many interesting longevity-associated variants, these variants are typically of small effect and cannot explain the observed patterns of heritability for this complex trait. The Amish provide a large cohort of extended kinships, making them an excellent population for in-depth analysis via a family-based approach. Heritability of longevity increases with age, with a significant genetic contribution being seen in individuals living beyond 60 years of age. In our present analysis, we show that the heritability of longevity is estimated to increase with age, particularly on the paternal side. Methods: The present analysis integrated both phenotypic and genotypic data and led to the discovery of a series of variants, distinct for stratified populations across ages and distinct for paternal and maternal cohorts. Specifically, 5437 subjects were analyzed, and a subset of 893 successfully genotyped individuals was used to assess CHIP heritability. We conducted the homozygosity analysis to examine whether homozygosity is associated with an increased likelihood of living beyond 90. We analyzed the Amish cohort genotyped for 614,957 SNPs. Results: We delineated 10 significant regions of homozygosity (ROH) specific to the age group of interest (>90). Of particular interest was an ROH on chromosome 13, P < 0.0001. The lead SNPs rs7318486 and rs9645914 point to COL4A2 and our lead SNP. 
COL25A1 encodes one of the six subunits of type IV collagen; the C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. COL4A2 mutations have been reported with a broad spectrum of cerebrovascular, renal, ophthalmological, cardiac, and muscular abnormalities. The second region of interest points to IRS2. Furthermore, we built a classifier using the SNPs obtained from the significant ROH region, with an AUC of 0.945, giving the ability to discriminate between those living up to 90 years of age and those living beyond. Conclusion: Our results suggest that a history of longevity does indeed contribute to increasing the odds of individual longevity. Preliminary results are consistent with the conjecture that the heritability of longevity is substantial when we look at the oldest fifth and smaller percentiles of survival, specifically in males. We will validate all the candidate variants in independent cohorts of centenarians to test whether they are robustly associated with human longevity. The regions of interest identified via ROH analysis could be of profound importance for understanding the genetic underpinnings of longevity.

Keywords: regions of homozygosity, longevity, SNP, Amish

Procedia PDF Downloads 231
2179 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis

Authors: Elcin Timur Cakmak, Ayse Oguzlar

Abstract:

This study presents the results of analysing tweets sent by Twitter users about the Russia-Ukraine war using bigram and trigram methods. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes turned to this war. Many people living in Russia and Ukraine reacted to and protested against this war, and expressed their deep concern about it, as they felt that the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways. The most popular way to do this is through social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, thousands of tweets about the war have been posted on Twitter from many countries of the world. These tweets are extracted through the Twitter API and analysed using the Python programming language. The aim of the study is to find the word sequences in these tweets by the n-gram method, which is widely used in computational linguistics and natural language processing. The tweet language used in the study is English. The data set consists of data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained from Twitter using the #ukraine, #russia, #war, #putin, and #zelensky hashtags together were captured as raw data, and the tweets remaining after cleaning in the preprocessing stage were included in the analysis stage. In the data analysis part, sentiments are identified to show what messages people send about the war on Twitter. Negative messages make up the majority of all the tweets, at 63.6%. Furthermore, the most frequently used bigram and trigram word groups are found. 
Regarding the results, the most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams. Also, the most frequently used word groups are “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, the accuracy of classifications is measured by Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. We gained the highest accuracy and F-measure values by the NB algorithm and the highest precision and recall values by the CART algorithm for bigrams. On the other hand, the highest values for accuracy, precision, and F-measure values are achieved by the CART algorithm, and the highest value for the recall is gained by NB for trigrams.
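The bigram and trigram counting described above can be sketched with the standard library. The example assumes simple lowercase whitespace tokenisation and uses invented sample sentences rather than the study's data; the authors' actual preprocessing is not specified in the abstract.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(texts, n, k=3):
    """Count n-grams over all texts and return the k most frequent."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()  # naive tokenisation (assumption)
        counts.update(ngrams(tokens, n))
    return counts.most_common(k)


# Invented sample sentences, for illustration only.
tweets = ["I do not want war", "I do not know", "I am not sure"]

print(top_ngrams(tweets, 2))
print(top_ngrams(tweets, 3, k=1))
```

The resulting counts can serve directly as features for classifiers such as CART or Naïve Bayes, with one feature per frequent bigram or trigram.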

Keywords: classification algorithms, machine learning, sentiment analysis, Twitter

Procedia PDF Downloads 71
2178 Tomato-Weed Classification by RetinaNet One-Step Neural Network

Authors: Dionisio Andujar, Juan López-Correa, Hugo Moreno, Angela Ri

Abstract:

An increased number of weeds in tomato crops greatly lowers yields. Weed identification with the aid of machine learning is important for carrying out site-specific control. Recent advances in computer vision are a powerful tool for tackling the problem. The analysis of RGB (Red, Green, Blue) images with Artificial Neural Networks has developed rapidly in the past few years, providing new methods for weed classification. The algorithms developed for crop and weed species classification aim at a real-time classification system using object detection algorithms based on Convolutional Neural Networks. The study site was located in commercial corn fields. The classification system has been tested, and the procedure can detect and classify weed seedlings in tomato fields. The input to the neural network was a set of 10,000 RGB images with a natural infestation of Cyperus rotundus L., Echinochloa crus-galli L., Setaria italica L., Portulaca oleracea L., and Solanum nigrum L. Validation was done with a random selection of RGB images containing the aforementioned species. The mean average precision (mAP) was established as the metric for object detection. The results showed agreement higher than 95%. The system will provide the input for an online spraying system. Thus, this work plays an important role in Site-Specific Weed Management by reducing herbicide use in a single step.
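
Mean average precision is built on intersection over union (IoU), the overlap score used to decide whether a predicted box matches a ground-truth box. The sketch below shows only that building block. The `(x1, y1, x2, y2)` box format and the function name `iou` are assumptions, and nothing here reproduces the authors' evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: a 2x2 prediction shifted by one unit from the truth.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detection is typically counted as correct when its IoU with a ground-truth box of the same class exceeds a chosen threshold; the abstract does not state which threshold was used.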

Keywords: deep learning, object detection, cnn, tomato, weeds

Procedia PDF Downloads 101
2177 Comparative Study and Parallel Implementation of Stochastic Models for Pricing of European Options Portfolios using Monte Carlo Methods

Authors: Vinayak Bassi, Rajpreet Singh

Abstract:

Over the years, with the emergence of sophisticated computers and algorithms, finance has become increasingly quantified through computational power. Asset valuation has been one of the key components of quantitative finance. In fact, it has become one of the initial steps in determining the risk related to a portfolio, the main goal of quantitative finance. This study draws a comparison between the valuation output generated by two stochastic dynamic models, namely Black-Scholes and Dupire’s bi-dimensionality model. Both of these models are formulated for computing the valuation function for a portfolio of European options using Monte Carlo simulation methods. Although Monte Carlo algorithms have a slower convergence rate than calculus-based simulation techniques (like FDM), they work quite effectively over high-dimensional dynamic models. A fidelity gap is analyzed between the static (historical) and stochastic inputs for a sample portfolio of underlying assets. To enhance the performance efficiency of the model, the study emphasizes the use of variable reduction methods and customized random number generators to implement parallelization. An attempt has been made to further implement Dupire’s model on a GPU to achieve higher computational performance. Furthermore, ideas are discussed around performance enhancement and bottleneck identification related to the implementation of options-pricing models on GPUs.
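
As a rough illustration of Monte Carlo valuation under the Black-Scholes model, the sketch below prices a single European call and compares the estimate with the closed-form price. It is a serial, single-option example, not the authors' portfolio or GPU implementation. The function names, the parameter values, and the use of antithetic variates (a standard variance-reduction technique chosen here for illustration) are all assumptions; the abstract does not say which reduction methods were used.

```python
import math
import random


def bs_call_price(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    # Standard normal CDF expressed through the error function.
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)


def mc_call_price(s0, k, r, sigma, t, n_pairs=50_000, seed=0):
    """Monte Carlo call price using antithetic pairs (z, -z)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        # Terminal prices under geometric Brownian motion for z and -z.
        payoff_plus = max(s0 * math.exp(drift + vol * z) - k, 0.0)
        payoff_minus = max(s0 * math.exp(drift - vol * z) - k, 0.0)
        total += 0.5 * (payoff_plus + payoff_minus)
    # Discount the average payoff back to today.
    return math.exp(-r * t) * total / n_pairs


# Hypothetical parameters: spot 100, strike 100, 5% rate, 20% vol, 1 year.
exact = bs_call_price(100.0, 100.0, 0.05, 0.2, 1.0)
estimate = mc_call_price(100.0, 100.0, 0.05, 0.2, 1.0)
print(f"closed form: {exact:.4f}  Monte Carlo: {estimate:.4f}")
```

Each simulated pair is independent of the others, which is why the loop lends itself to the kind of parallelization the abstract describes.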

Keywords: monte carlo, stochastic models, computational finance, parallel programming, scientific computing

Procedia PDF Downloads 157