Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 4049

Search results for: RNA-related genomic features

4049 TARF: Web Toolkit for Annotating RNA-Related Genomic Features

Abstract:

Genomic features, the genome-based coordinates, are commonly used for the representation of biological features such as genes, RNA transcripts and transcription factor binding sites. For the analysis of RNA-related genomic features, such as RNA modification sites, a common task is to correlate these features with transcript components (5'UTR, CDS, 3'UTR) to explore their distribution characteristics in terms of transcriptomic coordinates, e.g., to examine whether a specific type of biological feature is enriched near transcription start sites. Existing approaches for performing these tasks involve the manipulation of a gene database, conversion from genome-based coordinate to transcript-based coordinate, and visualization methods that are capable of showing RNA transcript components and distribution of the features. These steps are complicated and time consuming, and this is especially true for researchers who are not familiar with relevant tools. To overcome this obstacle, we develop a dedicated web app TARF, which represents web toolkit for annotating RNA-related genomic features. TARF web tool intends to provide a web-based way to easily annotate and visualize RNA-related genomic features. Once a user has uploaded the features with BED format and specified a built-in transcript database or uploaded a customized gene database with GTF format, the tool could fulfill its three main functions. First, it adds annotation on gene and RNA transcript components. For every features provided by the user, the overlapping with RNA transcript components are identified, and the information is combined in one table which is available for copy and download. Summary statistics about ambiguous belongings are also carried out. Second, the tool provides a convenient visualization method of the features on single gene/transcript level. For the selected gene, the tool shows the features with gene model on genome-based view, and also maps the features to transcript-based coordinate and show the distribution against one single spliced RNA transcript. Third, a global transcriptomic view of the genomic features is generated utilizing the Guitar R/Bioconductor package. The distribution of features on RNA transcripts are normalized with respect to RNA transcript landmarks and the enrichment of the features on different RNA transcript components is demonstrated. We tested the newly developed TARF toolkit with 3 different types of genomics features related to chromatin H3K4me3, RNA N6-methyladenosine (m6A) and RNA 5-methylcytosine (m5C), which are obtained from ChIP-Seq, MeRIP-Seq and RNA BS-Seq data, respectively. TARF successfully revealed their respective distribution characteristics, i.e. H3K4me3, m6A and m5C are enriched near transcription starting sites, stop codons and 5’UTRs, respectively. Overall, TARF is a useful web toolkit for annotation and visualization of RNA-related genomic features, and should help simplify the analysis of various RNA-related genomic features, especially those related RNA modifications.

Keywords: RNA-related genomic features, annotation, visualization, web server

Procedia PDF Downloads 202

4048 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 135

4047 Genomic Evidence for Ancient Human Migrations Along South America's East Coast

Authors: Andre Luiz Campelo dos Santos, Amanda Owings, Henry Socrates Lavalle Sullasi, Omer Gokcumen, Michael DeGiorgio, John Lindo

Abstract:

An increasing body of archaeological and genomic evidence have indicated a complex settlement process of the Americas. Here, four newly sequenced ancient genomes from Northeast Brazil and Uruguay are reported to share strong relationships with previously published samples from Panama and Southeast Brazil. Moreover, an unexpected high genomic affinity with present-day Onge is found in ancient individuals unearthed along the northern portion of South America’s Atlantic coast. These results provide genomic evidence for ancient migrations along South America’s Atlantic coast.

Keywords: archaeogenomics, atlantic coast, paleomigrations, South America

Procedia PDF Downloads 234

4046 Genomic Prediction Reliability Using Haplotypes Defined by Different Methods

Authors: Sohyoung Won, Heebal Kim, Dajeong Lim

Abstract:

Genomic prediction is an effective way to measure the abilities of livestock for breeding based on genomic estimated breeding values, statistically predicted values from genotype data using best linear unbiased prediction (BLUP). Using haplotypes, clusters of linked single nucleotide polymorphisms (SNPs), as markers instead of individual SNPs can improve the reliability of genomic prediction since the probability of a quantitative trait loci to be in strong linkage disequilibrium (LD) with markers is higher. To efficiently use haplotypes in genomic prediction, finding optimal ways to define haplotypes is needed. In this study, 770K SNP chip data was collected from Hanwoo (Korean cattle) population consisted of 2506 cattle. Haplotypes were first defined in three different ways using 770K SNP chip data: haplotypes were defined based on 1) length of haplotypes (bp), 2) the number of SNPs, and 3) k-medoids clustering by LD. To compare the methods in parallel, haplotypes defined by all methods were set to have comparable sizes; in each method, haplotypes defined to have an average number of 5, 10, 20 or 50 SNPs were tested respectively. A modified GBLUP method using haplotype alleles as predictor variables was implemented for testing the prediction reliability of each haplotype set. Also, conventional genomic BLUP (GBLUP) method, which uses individual SNPs were tested to evaluate the performance of the haplotype sets on genomic prediction. Carcass weight was used as the phenotype for testing. As a result, using haplotypes defined by all three methods showed increased reliability compared to conventional GBLUP. There were not many differences in the reliability between different haplotype defining methods. The reliability of genomic prediction was highest when the average number of SNPs per haplotype was 20 in all three methods, implying that haplotypes including around 20 SNPs can be optimal to use as markers for genomic prediction. When the number of alleles generated by each haplotype defining methods was compared, clustering by LD generated the least number of alleles. Using haplotype alleles for genomic prediction showed better performance, suggesting improved accuracy in genomic selection. The number of predictor variables was decreased when the LD-based method was used while all three haplotype defining methods showed similar performances. This suggests that defining haplotypes based on LD can reduce computational costs and allows efficient prediction. Finding optimal ways to define haplotypes and using the haplotype alleles as markers can provide improved performance and efficiency in genomic prediction.

Keywords: best linear unbiased predictor, genomic prediction, haplotype, linkage disequilibrium

Procedia PDF Downloads 138

4045 Sparse Modelling of Cancer Patients’ Survival Based on Genomic Copy Number Alterations

Authors: Khaled M. Alqahtani

Abstract:

Copy number alterations (CNA) are variations in the structure of the genome, where certain regions deviate from the typical two chromosomal copies. These alterations are pivotal in understanding tumor progression and are indicative of patients' survival outcomes. However, effectively modeling patients' survival based on their genomic CNA profiles while identifying relevant genomic regions remains a statistical challenge. Various methods, such as the Cox proportional hazard (PH) model with ridge, lasso, or elastic net penalties, have been proposed but often overlook the inherent dependencies between genomic regions, leading to results that are hard to interpret. In this study, we enhance the elastic net penalty by incorporating an additional penalty that accounts for these dependencies. This approach yields smooth parameter estimates and facilitates variable selection, resulting in a sparse solution. Our findings demonstrate that this method outperforms other models in predicting survival outcomes, as evidenced by our simulation study. Moreover, it allows for a more meaningful interpretation of genomic regions associated with patients' survival. We demonstrate the efficacy of our approach using both real data from a lung cancer cohort and simulated datasets.

Keywords: copy number alterations, cox proportional hazard, lung cancer, regression, sparse solution

Procedia PDF Downloads 40

4044 Genomic Resilience and Ecological Vulnerability in Coffea Arabica: Insights from Whole Genome Resequencing at Its Center of Origin

Authors: Zewdneh Zana Zate

Abstract:

The study focuses on the evolutionary and ecological genomics of both wild and cultivated Coffea arabica L. at its center of origin, Ethiopia, aiming to uncover how this vital species may withstand future climate changes. Utilizing bioclimatic models, we project the future distribution of Arabica under varied climate scenarios for 2050 and 2080, identifying potential conservation zones and immediate risk areas. Through whole-genome resequencing of accessions from Ethiopian gene banks, this research assesses genetic diversity and divergence between wild and cultivated populations. It explores relationships, demographic histories, and potential hybridization events among Coffea arabica accessions to better understand the species' origins and its connection to parental species. This genomic analysis also seeks to detect signs of natural or artificial selection across populations. Integrating these genomic discoveries with ecological data, the study evaluates the current and future ecological and genomic vulnerabilities of wild Coffea arabica, emphasizing necessary adaptations for survival. We have identified key genomic regions linked to environmental stress tolerance, which could be crucial for breeding more resilient Arabica varieties. Additionally, our ecological modeling predicted a contraction of suitable habitats, urging immediate conservation actions in identified key areas. This research not only elucidates the evolutionary history and adaptive strategies of Arabica but also informs conservation priorities and breeding strategies to enhance resilience to climate change. By synthesizing genomic and ecological insights, we provide a robust framework for developing effective management strategies aimed at sustaining Coffea arabica, a species of profound global importance, in its native habitat under evolving climatic conditions.

Keywords: coffea arabica, climate change adaptation, conservation strategies, genomic resilience

Procedia PDF Downloads 36

4043 Evaluation of Four Different DNA Targets in Polymerase Chain Reaction for Detection and Genotyping of Helicobacter pylori

Authors: Abu Salim Mustafa

Abstract:

Polymerase chain reaction (PCR) assays targeting genomic DNA segments have been established for the detection of Helicobacter pylori in clinical specimens. However, the data on comparative evaluations of various targets in detection of H. pylori are limited. Furthermore, the frequencies of vacA (s1 and s2) and cagA genotypes, which are suggested to be involved in the pathogenesis of H. pylori in other parts of the world, are not well studied in Kuwait. The aim of this study was to evaluate PCR assays for the detection and genotyping of H. pylori by targeting the amplification of DNA targets from four genomic segments. The genomic DNA were isolated from 72 clinical isolates of H. pylori and tested in PCR with four pairs of oligonucleotides primers, i.e. ECH-U/ECH-L, ET-5U/ET-5L, CagAF/CagAR and Vac1F/Vac1XR, which were expected to amplify targets of various sizes (471 bp, 230 bp, 183 bp and 176/203 bp, respectively) from the genomic DNA of H. pylori. The PCR-amplified DNA were analyzed by agarose gel electrophoresis. PCR products of expected size were obtained with all primer pairs by using genomic DNA isolated from H. pylori. DNA dilution experiments showed that the most sensitive PCR target was 471 bp DNA amplified by the primers ECH-U/ECH-L, followed by the targets of Vac1F/Vac1XR (176 bp/203 DNA), CagAF/CagAR (183 bp DNA) and ET-5U/ET-5L (230 bp DNA). However, when tested with undiluted genomic DNA isolated from single colonies of all isolates, the Vac1F/Vac1XR target provided the maximum positive results (71/72 (99% positives)), followed by ECH-U/ECH-L (69/72 (93% positives)), ET-5U/ET-5L (51/72 (71% positives)) and CagAF/CagAR (26/72 (46% positives)). The results of genotyping experiments showed that vacA s1 (46% positive) and vacA s2 (54% positive) genotypes were almost equally associated with VaCA+/CagA- isolates (P > 0.05), but with VacA+/CagA+ isolates, S1 genotype (92% positive) was more frequently detected than S2 genotype (8% positive) (P< 0.0001). In conclusion, among the primer pairs tested, Vac1F/Vac1XR provided the best results for detection of H. pylori. The genotyping experiments showed that vacA s1 and vacA s2 genotypes were almost equally associated with vaCA⁺/cagA^-isolates, but vacA s1 genotype had a significantly increased association with vacA⁺/cagA⁺isolates.

Keywords: H. pylori, PCR, detection, genotyping

Procedia PDF Downloads 127

4042 Genomic Analysis of Whole Genome Sequencing of Leishmania Major

Authors: Fatimazahrae Elbakri, Azeddine Ibrahimi, Meryem Lemrani, Dris Belghyti

Abstract:

Leishmaniasis represents a major public health problem because of the number of cases recorded each year and the wide distribution of the disease. It is a parasitic disease of flagellated protozoa transmitted by the bite of certain species of sandfly, causing a spectrum of clinical pathology in humans ranging from disfiguring skin lesions to fatal visceral leishmaniasis. Cutaneous leishmaniasis due to Leishmania major is a polymorphic disease; in fact, the infection can be asymptomatic, localized, or disseminated. The objective of this work is to determine the genomic diversity that contributes to clinical variability by trying to identify the variation in chromosome number and to extract SNPs and SNPs and InDels; it is based on four sequences (WGS) of Leishmania major available on NCBI in Fastq form, from three countries: Tunisia, Algeria, and Israel, the analysis is set up from a pipeline to facilitate the discovery of genetic diversity, in particular SNP and chromosomal somy.

Keywords: Leshmania major, cutaneous Leishmania, NGS, genomic, somy, variant calling

Procedia PDF Downloads 74

4041 Resequencing and Genomic Study of Wild Coffea Arabica Unveils Genetic Groups at Its Origin and Their Geographic Distribution

Authors: Zate Zewdneh Zana

Abstract:

Coffea arabica (Arabica coffee), a cornerstone of the global beverage industry, necessitates rigorous genetic conservation due to its economic significance and genetic complexity. In this study, we performed whole-genome resequencing of wild species collected from its birthplace, Ethiopia. Advanced Illumina sequencing technology facilitated the mapping of a high percentage of clean reads to the C. arabica reference genome, revealing a substantial number of genetic variants, predominantly SNPs. Our comprehensive analysis not only uncovered a notable distribution of genomic variants across the coffee genome but also identified distinct genetic groups through phylogenetic and population structure analyses. This genomic study provides invaluable insights into the genetic diversity of C. arabica, highlighting the potential of identified SNPs and InDels in enhancing our understanding of key agronomic traits. The findings contribute significantly to genetic studies and support strategic breeding and conservation efforts essential for sustaining the global coffee industry.

Keywords: population genetics, wild species, evolutionary study, coffee plant

Procedia PDF Downloads 30

4040 Genetic Instabilities in Marine Bivalve Following Benzo(α)pyrene Exposure: Utilization of Combined Random Amplified Polymorphic DNA and Comet Assay

Authors: Mengjie Qu, Yi Wang, Jiawei Ding, Siyu Chen, Yanan Di

Abstract:

Marine ecosystem is facing intensified multiple stresses caused by environmental contaminants from human activities. Xenobiotics, such as benzo(α)pyrene (BaP) have been discharged into marine environment and cause hazardous impacts on both marine organisms and human beings. As a filter-feeder, marine mussels, Mytilus spp., has been extensively used to monitor the marine environment. However, their genomic alterations induced by such xenobiotics are still kept unknown. In the present study, gills, as the first defense barrier in mussels, were selected to evaluate the genetic instability alterations induced by the exposure to BaP both in vivo and in vitro. Both random amplified polymorphic DNA (RAPD) assay and comet assay were applied as the rapid tools to assess the environmental stresses due to their low money- and time-consumption. All mussels were identified to be the single species of Mytilus coruscus before used in BaP exposure at the concentration of 56 μg/l for 1 & 3 days (in vivo exposure) or 1 & 3 hours (in vitro). Both RAPD and comet assay results were showed significantly increased genomic instability with time-specific altering pattern. After the recovery period in 'in vivo' exposure, the genomic status was as same as control condition. However, the relative higher genomic instabilities were still observed in gill cells after the recovery from in vitro exposure condition. Different repair mechanisms or signaling pathway might be involved in the isolated gill cells in the comparison with intact tissues. The study provides the robust and rapid techniques to exam the genomic stability in marine organisms in response to marine environmental changes and provide basic information for further mechanism research in stress responses in marine organisms.

Keywords: genotoxic impacts, in vivo/vitro exposure, marine mussels, RAPD and comet assay

Procedia PDF Downloads 273

4039 Analysis of Saudi Breast Cancer Patients’ Primary Tumors using Array Comparative Genomic Hybridization

Authors: L. M. Al-Harbi, A. M. Shokry, J. S. M. Sabir, A. Chaudhary, J. Manikandan, K. S. Saini

Abstract:

Breast cancer is the second most common cause of cancer death worldwide and is the most common malignancy among Saudi females. During breast carcinogenesis, a wide-array of cytogenetic changes involving deletions, or amplification, or translocations, of part or whole of chromosome regions have been observed. Because of the limitations of various earlier technologies, newer tools are developed to scan for changes at the genomic level. Recently, Array Comparative Genomic Hybridization (aCGH) technique has been applied for detecting segmental genomic alterations at molecular level. In this study, aCGH was performed on twenty breast cancer tumors and their matching non-tumor (normal) counterparts using the Agilent 2x400K. Several regions were identified to be either amplified or deleted in a tumor-specific manner. Most frequent alterations were amplification of chromosome 1q, chromosome 8q, 20q, and deletions at 16q were also detected. The amplification of genetic events at 1q and 8q were further validated using FISH analysis using probes targeting 1q25 and 8q (MYC gene). The copy number changes at these loci can potentially cause a significant change in the tumor behavior, as deletions in the E-Cadherin (CDH1)-tumor suppressor gene as well as amplification of the oncogenes-Aurora Kinase A. (AURKA) and MYC could make these tumors highly metastatic. This study validates the use of aCGH in Saudi breast cancer patients and sets the foundations necessary for performing larger cohort studies searching for ethnicity-specific biomarkers and gene copy number variations.

Keywords: breast cancer, molecular biology, ecology, environment

Procedia PDF Downloads 371

4038 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 108

4037 Isolation and Identification of Diacylglycerol Acyltransferase Type-2 (GAT2) Genes from Three Egyptian Olive Cultivars

Authors: Yahia I. Mohamed, Ahmed I. Marzouk, Mohamed A. Yacout

Abstract:

Aim of this work was to study the genetic basis for oil accumulation in olive fruit via tracking DGAT2 (Diacylglycerol acyltransferase type-2) gene in three Egyptian Origen Olive cultivars namely Toffahi, Hamed and Maraki using molecular marker techniques and bioinformatics tools. Results illustrate that, firstly: specific genomic band of Maraki cultivars was identified as DGAT2 (Diacylglycerol acyltransferase type-2) and identical for this gene in Olea europaea with 100 % of similarity. Secondly, differential genomic band of Maraki cultivars which produced from RAPD fingerprinting technique reflected predicted distinguished sequence which identified as DGAT2 (Diacylglycerol acyltransferase type-2) in Fragaria vesca subsp. Vesca with 76% of sequential similarity. Third and finally, specific genomic specific band of Hamed cultivars was indentified as two fragments, 1-Olea europaea cultivar Koroneiki diacylglycerol acyltransferase type 2 mRNA, complete cds with two matches regions with 99% or 2-PREDICTED: Fragaria vesca subsp. vesca diacylglycerol O-acyltransferase 2-like (LOC101313050), mRNA with 86% of similarity.

Keywords: Olea europaea, fingerprinting, diacylglycerol acyltransferase type-2 (DGAT2), Egypt

Procedia PDF Downloads 495

4036 Evolutionary Genomic Analysis of Adaptation Genomics

Authors: Agostinho Antunes

Abstract:

The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completed sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms provides also the framework to better understand the genetic makeup of such species and related ones, allowing to explore the genetic changes underlining the evolution of diverse phenotypic traits. Here, recent results from our group retrieved from comparative evolutionary genomic analyses of varied species will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: adaptation, animals, evolution, genomics

Procedia PDF Downloads 426

4035 Suppression Subtractive Hybridization Technique for Identification of the Differentially Expressed Genes

Authors: Tuhina-khatun, Mohamed Hanafi Musa, Mohd Rafii Yosup, Wong Mui Yun, Aktar-uz-Zaman, Mahbod Sahebi

Abstract:

Suppression subtractive hybridization (SSH) method is valuable tool for identifying differentially regulated genes in disease specific or tissue specific genes important for cellular growth and differentiation. It is a widely used method for separating DNA molecules that distinguish two closely related DNA samples. SSH is one of the most powerful and popular methods for generating subtracted cDNA or genomic DNA libraries. It is based primarily on a suppression polymerase chain reaction (PCR) technique and combines normalization and subtraction in a solitary procedure. The normalization step equalizes the abundance of DNA fragments within the target population, and the subtraction step excludes sequences that are common to the populations being compared. This dramatically increases the probability of obtaining low-abundance differentially expressed cDNAs or genomic DNA fragments and simplifies analysis of the subtracted library. SSH technique is applicable to many comparative and functional genetic studies for the identification of disease, developmental, tissue specific, or other differentially expressed genes, as well as for the recovery of genomic DNA fragments distinguishing the samples under comparison.

Keywords: suppression subtractive hybridization, differentially expressed genes, disease specific genes, tissue specific genes

Procedia PDF Downloads 427

4034 Benefit Sharing of Research Participants in Human Genomic Research: Ethical Concerns and Ramifications

Authors: Tamanda Kamwendo

Abstract:

The concept of benefit sharing has been a prominent global debate in the world, gaining traction in human research ethics. Despite its prevalence, the concept of benefit sharing is not without controversy over its meaning and justification. This is due to the fact that it lacks a broadly accepted definition and many proponents discuss benefit sharing by arguing for its necessity rather than engaging in critical intellectual engagement with technical issues such as what it implies. What is clear in the literature is that the underlying premise of benefit-sharing is that research involving underprivileged and marginalized people is currently unjust and inequitable because these people are denied access to these gains; thus, benefit-sharing arrangements are required for these research projects to be just and equitable. This paper, therefore, investigates the discourses and justifications behind the concept of benefit sharing to human participants, particularly when dealing with human genomics research. Furthermore, considering that benefit sharing is generally viewed as a transaction between research organizations and research participants, it raises ethical concerns concerning the commodification of human material and undermines the sanctity of the human genome. This is predicated on the idea that research sponsors would be compelled to deliver a minimum set of possible benefits to research participants and communities in exchange for their involvement in the study. There is, therefore, need to protect benefit-sharing practices in international health research by developing a governance legal framework. A legal framework of benefit sharing will also dispel the issue of commodification of human material where human genomic research is done.

Keywords: benefit sharing, human participants, human genomic research, ethical concerns

Procedia PDF Downloads 71

4033 Relevant LMA Features for Human Motion Recognition

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Motion recognition from videos is actually a very complex task due to the high variability of motions. This paper describes the challenges of human motion recognition, especially motion representation step with relevant features. Our descriptor vector is inspired from Laban Movement Analysis method. We propose discriminative features using the Random Forest algorithm in order to remove redundant features and make learning algorithms operate faster and more effectively. We validate our method on MSRC-12 and UTKinect datasets.

Keywords: discriminative LMA features, features reduction, human motion recognition, random forest

Procedia PDF Downloads 189

4032 Genomics of Adaptation in the Sea

Authors: Agostinho Antunes

Abstract:

The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completed sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms provides also the framework to better understand the genetic makeup of such species and related ones, allowing to explore the genetic changes underlining the evolution of diverse phenotypic traits. Here, recent results from our group retrieved from comparative evolutionary genomic analyses of selected marine animal species will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: marine genomics, evolutionary bioinformatics, human genome sequencing, genomic analyses

Procedia PDF Downloads 605

4031 Cytogenetic Characterization of the VERO Cell Line Based on Comparisons with the Subline; Implication for Authorization and Quality Control of Animal Cell Lines

Authors: Fumio Kasai, Noriko Hirayama, Jorge Pereira, Azusa Ohtani, Masashi Iemura, Malcolm A. Ferguson Smith, Arihiro Kohara

Abstract:

The VERO cell line was established in 1962 from normal tissue of an African green monkey, Chlorocebus aethiops (2n=60), and has been commonly used worldwide for screening for toxins or as a cell substrate for the production of viral vaccines. The VERO genome was sequenced in 2014; however, its cytogenetic features have not been fully characterized as it contains several chromosome abnormalities and different karyotypes coexist in the cell line. In this study, the VERO cell line (JCRB0111) was compared with one of the sublines. In contrast to 59 chromosomes as the modal chromosome number in the VERO cell line, the subline had two peaks of 56 and 58 chromosomes. M-FISH analysis using human probes revealed that the VERO cell line was characterized by a translocation t(2;25) found in all metaphases, which was absent in the subline. Different abnormalities detected only in the subline show that the cell line is heterogeneous, indicating that the subline has the potential to change its genomic characteristics during cell culture. The various alterations in the two independent lineages suggest that genomic changes in both VERO cells can be accounted for by progressive rearrangements during their evolution in culture. Both t(5;X) and t(8;14) observed in all metaphases of the two cell lines might have a key role in VERO cells and could be used as genetic markers to identify VERO cells. The flow karyotype shows distinct differences from normal. Further analysis of sorted abnormal chromosomes may uncover other characteristics of VERO cells. Because of the absence of STR data, cytogenetic data are important in characterizing animal cell lines and can be an indicator of their quality control.

Keywords: VERO, cell culture passage, chromosome rearrangement, heterogeneous cells

Procedia PDF Downloads 410

4030 Antibody Reactivity of Synthetic Peptides Belonging to Proteins Encoded by Genes Located in Mycobacterium tuberculosis-Specific Genomic Regions of Differences

Authors: Abu Salim Mustafa

Abstract:

The comparisons of mycobacterial genomes have identified several Mycobacterium tuberculosis-specific genomic regions that are absent in other mycobacteria and are known as regions of differences. Due to M. tuberculosis-specificity, the peptides encoded by these regions could be useful in the specific diagnosis of tuberculosis. To explore this possibility, overlapping synthetic peptides corresponding to 39 proteins predicted to be encoded by genes present in regions of differences were tested for antibody-reactivity with sera from tuberculosis patients and healthy subjects. The results identified four immunodominant peptides corresponding to four different proteins, with three of the peptides showing significantly stronger antibody reactivity and rate of positivity with sera from tuberculosis patients than healthy subjects. The fourth peptide was recognized equally well by the sera of tuberculosis patients as well as healthy subjects. Predication of antibody epitopes by bioinformatics analyses using ABCpred server predicted multiple linear epitopes in each peptide. Furthermore, peptide sequence analysis for sequence identity using BLAST suggested M. tuberculosis-specificity for the three peptides that had preferential reactivity with sera from tuberculosis patients, but the peptide with equal reactivity with sera of TB patients and healthy subjects showed significant identity with sequences present in nob-tuberculous mycobacteria. The three identified M. tuberculosis-specific immunodominant peptides may be useful in the serological diagnosis of tuberculosis.

Keywords: genomic regions of differences, Mycobacterium tuberculossis, peptides, serodiagnosis

Procedia PDF Downloads 178

4029 Impact of Variability in Delineation on PET Radiomics Features in Lung Tumors

Authors: Mahsa Falahatpour

Abstract:

Introduction: This study aims to explore how inter-observer variability in manual tumor segmentation impacts the reliability of radiomic features in non–small cell lung cancer (NSCLC). Methods: The study included twenty-three NSCLC tumors. Each patient had three tumor segmentations (VOL1, VOL2, VOL3) contoured on PET/CT scans by three radiation oncologists. Dice coefficients (DCS) were used to measure the segmentation variability. Radiomic features were extracted with 3D-slicer software, consisting of 66 features: first-order (n=15), second-order (GLCM, GLDM, GLRLM, and GLSZM) (n=33). The inter-observer variability of radiomic features was assessed using the intraclass correlation coefficient (ICC). An ICC > 0.8 indicates good stability. Results: The mean DSC of VOL1, VOL2, and VOL3 was 0.80 ± 0.04, 0.85 ± 0.03, and 0.76 ± 0.06, respectively. 92% of all extracted radiomic features were found to be stable (ICC > 0.8). The GLCM texture features had the highest stability (96%), followed by GLRLM features (90%) and GLSZM features (87%). The DSC was found to be highly correlated with the stability of radiomic features. Conclusion: The variability in inter-observer segmentation significantly impacts radiomics analysis, leading to a reduction in the number of appropriate radiomic features.

Keywords: PET/CT, radiomics, radiotherapy, segmentation, NSCLC

Procedia PDF Downloads 30

4028 Genomic Sequence Representation Learning: An Analysis of K-Mer Vector Embedding Dimensionality

Authors: James Jr. Mashiyane, Risuna Nkolele, Stephanie J. Müller, Gciniwe S. Dlamini, Rebone L. Meraba, Darlington S. Mapiye

Abstract:

When performing language tasks in natural language processing (NLP), the dimensionality of word embeddings is chosen either ad-hoc or is calculated by optimizing the Pairwise Inner Product (PIP) loss. The PIP loss is a metric that measures the dissimilarity between word embeddings, and it is obtained through matrix perturbation theory by utilizing the unitary invariance of word embeddings. Unlike in natural language, in genomics, especially in genome sequence processing, unlike in natural language processing, there is no notion of a “word,” but rather, there are sequence substrings of length k called k-mers. K-mers sizes matter, and they vary depending on the goal of the task at hand. The dimensionality of word embeddings in NLP has been studied using the matrix perturbation theory and the PIP loss. In this paper, the sufficiency and reliability of applying word-embedding algorithms to various genomic sequence datasets are investigated to understand the relationship between the k-mer size and their embedding dimension. This is completed by studying the scaling capability of three embedding algorithms, namely Latent Semantic analysis (LSA), Word2Vec, and Global Vectors (GloVe), with respect to the k-mer size. Utilising the PIP loss as a metric to train embeddings on different datasets, we also show that Word2Vec outperforms LSA and GloVe in accurate computing embeddings as both the k-mer size and vocabulary increase. Finally, the shortcomings of natural language processing embedding algorithms in performing genomic tasks are discussed.

Keywords: word embeddings, k-mer embedding, dimensionality reduction

Procedia PDF Downloads 133

4027 Tree Species Classification Using Effective Features of Polarimetric SAR and Hyperspectral Images

Authors: Milad Vahidi, Mahmod R. Sahebi, Mehrnoosh Omati, Reza Mohammadi

Abstract:

Forest management organizations need information to perform their work effectively. Remote sensing is an effective method to acquire information from the Earth. Two datasets of remote sensing images were used to classify forested regions. Firstly, all of extractable features from hyperspectral and PolSAR images were extracted. The optical features were spectral indexes related to the chemical, water contents, structural indexes, effective bands and absorption features. Also, PolSAR features were the original data, target decomposition components, and SAR discriminators features. Secondly, the particle swarm optimization (PSO) and the genetic algorithms (GA) were applied to select optimization features. Furthermore, the support vector machine (SVM) classifier was used to classify the image. The results showed that the combination of PSO and SVM had higher overall accuracy than the other cases. This combination provided overall accuracy about 90.56%. The effective features were the spectral index, the bands in shortwave infrared (SWIR) and the visible ranges and certain PolSAR features.

Keywords: hyperspectral, PolSAR, feature selection, SVM

Procedia PDF Downloads 411

4026 Biophysical Characterization of Archaeal Cyclophilin Like Chaperone Protein

Authors: Vineeta Kaushik, Manisha Goel

Abstract:

Chaperones are proteins that help other proteins fold correctly, and are found in all domains of life i.e., prokaryotes, eukaryotes and archaea. Various comparative genomic studies have suggested that the archaeal protein folding machinery appears to be highly similar to that found in eukaryotes. In case of protein folding; slow rotation of peptide prolyl-imide bond is often the rate limiting step. Formation of the prolyl-imide bond during the folding of a protein requires the assistance of other proteins, termed as peptide prolyl cis-trans isomerases (PPIases). Cyclophilins constitute the class of peptide prolyl isomerases with a wide range of biological function like protein folding, signaling and chaperoning. Most of the cyclophilins exhibit PPIase enzymatic activity and play active role in substrate protein folding which classifies them as a category of molecular chaperones. Till date, there is not very much data available in the literature on archaeal cyclophilins. We aim to compare the structural and biochemical features of the cyclophilin protein from within the three domains to elucidate the features affecting their stability and enzyme activity. In the present study, we carry out in-silico analysis of the cyclophilin proteins to predict their conserved residues, sites under positive selection and compare these proteins to their bacterial and eukaryotic counterparts to predict functional divergence. We also aim to clone and express these proteins in heterologous system and study their biophysical characteristics in detail using techniques like CD and fluorescence spectroscopy. Overall we aim to understand the features contributing to the folding, stability and dynamics of the archaeal cyclophilin proteins.

Keywords: biophysical characterization, x-ray crystallography, chaperone-like activity, cyclophilin, PPIase activity

Procedia PDF Downloads 209

4025 Active Features Determination: A Unified Framework

Authors: Meenal Badki

Abstract:

We address the issue of active feature determination, where the objective is to determine the set of examples on which additional data (such as lab tests) needs to be gathered, given a large number of examples with some features (such as demographics) and some examples with all the features (such as the complete Electronic Health Record). We note that certain features may be more costly, unique, or laborious to gather. Our proposal is a general active learning approach that is independent of classifiers and similarity metrics. It allows us to identify examples that differ from the full data set and obtain all the features for the examples that match. Our comprehensive evaluation shows the efficacy of this approach, which is driven by four authentic clinical tasks.

Keywords: feature determination, classification, active learning, sample-efficiency

Procedia PDF Downloads 67

4024 Single Cell and Spatial Transcriptomics: A Beginners Viewpoint from the Conceptual Pipeline

Authors: Leo Nnamdi Ozurumba-Dwight

Abstract:

Messenger ribooxynucleic acid (mRNA) molecules are compositional, protein-based. These proteins, encoding mRNA molecules (which collectively connote the transcriptome), when analyzed by RNA sequencing (RNAseq), unveils the nature of gene expression in the RNA. The obtained gene expression provides clues of cellular traits and their dynamics in presentations. These can be studied in relation to function and responses. RNAseq is a practical concept in Genomics as it enables detection and quantitative analysis of mRNA molecules. Single cell and spatial transcriptomics both present varying avenues for expositions in genomic characteristics of single cells and pooled cells in disease conditions such as cancer, auto-immune diseases, hematopoietic based diseases, among others, from investigated biological tissue samples. Single cell transcriptomics helps conduct a direct assessment of each building unit of tissues (the cell) during diagnosis and molecular gene expressional studies. A typical technique to achieve this is through the use of a single-cell RNA sequencer (scRNAseq), which helps in conducting high throughput genomic expressional studies. However, this technique generates expressional gene data for several cells which lack presentations on the cells’ positional coordinates within the tissue. As science is developmental, the use of complimentary pre-established tissue reference maps using molecular and bioinformatics techniques has innovatively sprung-forth and is now used to resolve this set back to produce both levels of data in one shot of scRNAseq analysis. This is an emerging conceptual approach in methodology for integrative and progressively dependable transcriptomics analysis. This can support in-situ fashioned analysis for better understanding of tissue functional organization, unveil new biomarkers for early-stage detection of diseases, biomarkers for therapeutic targets in drug development, and exposit nature of cell-to-cell interactions. Also, these are vital genomic signatures and characterizations of clinical applications. Over the past decades, RNAseq has generated a wide array of information that is igniting bespoke breakthroughs and innovations in Biomedicine. On the other side, spatial transcriptomics is tissue level based and utilized to study biological specimens having heterogeneous features. It exposits the gross identity of investigated mammalian tissues, which can then be used to study cell differentiation, track cell line trajectory patterns and behavior, and regulatory homeostasis in disease states. Also, it requires referenced positional analysis to make up of genomic signatures that will be sassed from the single cells in the tissue sample. Given these two presented approaches to RNA transcriptomics study in varying quantities of cell lines, with avenues for appropriate resolutions, both approaches have made the study of gene expression from mRNA molecules interesting, progressive, developmental, and helping to tackle health challenges head-on.

Keywords: transcriptomics, RNA sequencing, single cell, spatial, gene expression.

Procedia PDF Downloads 117

4023 2D Point Clouds Features from Radar for Helicopter Classification

Authors: Danilo Habermann, Aleksander Medella, Carla Cremon, Yusef Caceres

Abstract:

This paper aims to analyze the ability of 2d point clouds features to classify different models of helicopters using radars. This method does not need to estimate the blade length, the number of blades of helicopters, and the period of their micro-Doppler signatures. It is also not necessary to generate spectrograms (or any other image based on time and frequency domain). This work transforms a radar return signal into a 2D point cloud and extracts features of it. Three classifiers are used to distinguish 9 different helicopter models in order to analyze the performance of the features used in this work. The high accuracy obtained with each of the classifiers demonstrates that the 2D point clouds features are very useful for classifying helicopters from radar signal.

Keywords: helicopter classification, point clouds features, radar, supervised classifiers

Procedia PDF Downloads 217

4022 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from the face images is a challenging task. Several related methods have not utilized the dynamic facial features effectively for high performance. This paper proposes a method for emotions recognition using dynamic facial features with high performance. Initially, local features are captured by Gabor filter with different scale and orientations in each frame for finding the position and scale of face part from different backgrounds. The Gabor features are sent to the ensemble classifier for detecting Gabor facial features. The region of dynamic features is captured from the Gabor facial features in the consecutive frames which represent the dynamic variations of facial appearances. In each region of dynamic features is normalized using Z-score normalization method which is further encoded into binary pattern features with the help of threshold values. The binary features are passed to Multi-class AdaBoost classifier algorithm with the well-trained database contain happiness, sadness, surprise, fear, anger, disgust, and neutral expressions to classify the discriminative dynamic features for emotions recognition. The developed method is deployed on the Ryerson Multimedia Research Lab and Cohn-Kanade databases and they show significant performance improvement owing to their dynamic features when compared with the existing methods.

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 269

4021 Genomic Adaptation to Local Climate Conditions in Native Cattle Using Whole Genome Sequencing Data

Authors: Rugang Tian

Abstract:

In this study, we generated whole-genome sequence (WGS) data from110 native cattle. Together with whole-genome sequences from world-wide cattle populations, we estimated the genetic diversity and population genetic structure of different cattle populations. Our findings revealed clustering of cattle groups in line with their geographic locations. We identified noticeable genetic diversity between indigenous cattle breeds and commercial populations. Among all studied cattle groups, lower genetic diversity measures were found in commercial populations, however, high genetic diversity were detected in some local cattle, particularly in Rashoki and Mongolian breeds. Our search for potential genomic regions under selection in native cattle revealed several candidate genes related with immune response and cold shock protein on multiple chromosomes such as TRPM8, NMUR1, PRKAA2, SMTNL2 and OXR1 that are involved in energy metabolism and metabolic homeostasis.

Keywords: cattle, whole-genome, population structure, adaptation

Procedia PDF Downloads 62

4020 New Features for Copy-Move Image Forgery Detection

Authors: Michael Zimba

Abstract:

A novel set of features for copy-move image forgery, CMIF, detection method is proposed. The proposed set presents a new approach which relies on electrostatic field theory, EFT. Solely for the purpose of reducing the dimension of a suspicious image, firstly performs discrete wavelet transform, DWT, of the suspicious image and extracts only the approximation subband. The extracted subband is then bijectively mapped onto a virtual electrostatic field where concepts of EFT are utilised to extract robust features. The extracted features are shown to be invariant to additive noise, JPEG compression, and affine transformation. The proposed features can also be used in general object matching.

Keywords: virtual electrostatic field, features, affine transformation, copy-move image forgery

Procedia PDF Downloads 539