Search results for: massively parallel sequencing (MPS)
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1794

1644 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge that calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay among accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.
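
The k-mer representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes overlapping k-mers extracted with a sliding window of step 1 over the alphabet A, C, G, T; the abstract does not state how the authors extract or normalise k-mers, and the function names and the toy sequence below are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in a DNA sequence (sliding window, step 1)."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_vector(sequence, k, alphabet="ACGT"):
    """Fixed-length feature vector: one count per possible k-mer, in lexicographic order."""
    counts = kmer_counts(sequence, k)
    return [counts.get("".join(p), 0) for p in product(alphabet, repeat=k)]


toy_sequence = "ATGCGATGCA"  # made-up example, not MTB data
print(kmer_counts(toy_sequence, 3))
print(len(kmer_vector(toy_sequence, 2)))  # one feature per possible 2-mer
```

With this encoding the feature space has 4^k dimensions, so k = 10 already gives 1,048,576 possible k-mers per sequence, which is one way to see the trade-off between k-mer size and computing resources that the abstract describes.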

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 142
1643 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge that calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay among accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 129
1642 Whole Exome Sequencing in Characterizing Mysterious Crippling Disorder in India

Authors: Swarkar Sharma, Ekta Rai, Ankit Mahajan, Parvinder Kumar, Manoj K Dhar, Sushil Razdan, Kumarasamy Thangaraj, Carol Wise, Shiro Ikegawa M.D., K.K. Pandita M.D.

Abstract:

Rare disorders are poorly understood and hence remain uncharacterized, or patients are misdiagnosed and receive poor medical attention. A rare, mysterious skeletal disorder that remained unidentified for decades and rendered many people physically challenged and disabled for life has been reported in an isolated remote village, ‘Arai’, in the Poonch district of Jammu and Kashmir. This village is located deep in the mountains, and the population residing in the region is highly consanguineous. In our survey of the region, 70 affected people showing a similar phenotype were reported in the village, which has a population of approximately 5000 individuals. We were able to collect samples from two multigenerational extended families from the village. Through whole exome sequencing (WES), we identified a rare variation, NM_003880.3:c.156C>A NP_003871.1:p.Cys52Ter, which results in the introduction of a premature stop codon in the WISP3 gene. We found this variation perfectly segregating with the disease in one of the families. However, this variation was absent in the other family. Interestingly, a novel splice site mutation at position c.643+1G>A of the WISP3 gene, perfectly segregating with the disease, was observed in the second family. Thus, by exploiting WES and putting different lines of evidence together (familial histories and genetic data, clinical features, radiological and biochemical tests and findings), the disease has finally been diagnosed as a very rare recessive hereditary skeletal disease, “Progressive Pseudorheumatoid Arthropathy of Childhood” (PPAC), also known as “Spondyloepiphyseal Dysplasia Tarda with Progressive Arthropathy” (SEDT-PA). This genetic characterization and identification of the disease-causing mutations will aid in genetic counseling, critically required to curb this rare disorder and to prevent its appearance in future generations in the population. Further, understanding the role of the WISP3 gene in the biological pathways should help in developing treatment for the disorder.

Keywords: whole exome sequencing, Next Generation Sequencing, rare disorders

Procedia PDF Downloads 394
1641 Mutations in rpoB, katG and inhA Genes: The Association with Resistance to Rifampicin and Isoniazid in Egyptian Mycobacterium tuberculosis Clinical Isolates

Authors: Ayman K. El Essawy, Amal M. Hosny, Hala M. Abu Shady

Abstract:

Rapid detection of TB and drug resistance both optimizes treatment and improves outcomes. In the current study, respiratory specimens were collected from 155 patients. Conventional susceptibility testing and MIC determination were performed for rifampicin (RIF) and isoniazid (INH). The Genotype MTBDRplus assay, a molecular genetic assay based on DNA-STRIP technology, and specific gene sequencing with primers for the rpoB, katG, and mab-inhA genes were used to detect mutations associated with resistance to rifampicin and isoniazid. In comparison to other categories, most rifampicin-resistant (61.5%) and isoniazid-resistant (47.1%) isolates were from patients who relapsed in treatment. The genotypic profile (using the Genotype MTBDRplus assay) of multi-drug resistant (MDR) isolates showed a missing katG wild type 1 (WT1) band and the appearance of the mutation band katG MUT2. For isoniazid mono-resistant isolates, 80% showed katG MUT1, 20% showed katG MUT1, and inhA MUT1, 20% showed only inhA MUT1. Accordingly, 100% of isoniazid-resistant strains were detected by this assay. Out of 17 resistant strains, 16 had mutation bands for katG that distinguished high resistance to isoniazid. The assay could clearly detect rifampicin resistance among 66.7% of MDR isolates, which showed the mutation band rpoB MUT3, while 33.3% of them were considered unknown. One mono-resistant rifampicin isolate did not show rifampicin mutation bands in the Genotype MTBDRplus assay, but it showed an unexpected mutation in codon 531 of rpoB by DNA sequence analysis. Rifampicin resistance in this strain could be associated with a mutation in codon 531 of rpoB (based on molecular sequencing), and the Genotype MTBDRplus assay could not detect the associated mutation. If the results of the Genotype MTBDRplus assay and sequencing are combined, this strain shows a hetero-resistance pattern.
Gene sequencing of eight selected isolates, previously tested by the Genotype MTBDRplus assay, detected resistance mutations mainly in codon 315 (katG gene) and position -15 in the inhA promoter for isoniazid resistance, and in codon 531 (rpoB gene) for rifampicin resistance. Genotyping techniques allow distinguishing between recurrent cases of reinfection or reactivation and support epidemiological studies.

Keywords: M. tuberculosis, rpoB, KatG, inhA, genotype MTBDRplus

Procedia PDF Downloads 138
1640 Biodegradation of Direct Red 23 by Bacterial Consortium Isolated from Dye Contaminated Soil Using Sequential Air-lift Bioreactor

Authors: Lata Kumari, Dhanesh Tiwary, Pradeep Kumar Mishra

Abstract:

Effluent from industries such as textile, carpet, food, pharmaceutical and many others is a big challenge because of its recalcitrant and xenobiotic nature. Recently, biodegradation of dye wastewater by biological means has been widely used because it is eco-friendly and cost-effective, with a higher percentage of dye removal from wastewater. The present study deals with the biodegradation and decolourization of Direct Red 23 dye using an indigenously isolated bacterial consortium. The bacterial consortium was isolated from a soil sample from a dye-contaminated site near a cluster of carpet industries in Bhadohi, Uttar Pradesh, India. The bacterial strains forming the consortium were identified and characterized by morphological, biochemical and 16S rRNA gene sequence analysis. The bacterial strains, mainly Staphylococcus saprophyticus strain BHUSS X3 (KJ439576), Microbacterium sp. BHUMSp X4 (KJ740222) and Staphylococcus saprophyticus strain BHUSS X5 (KJ439576), were used as the consortium for further studies of dye decolourization. Experimental investigations were made in a Sequencing Air-lift Bioreactor using a synthetic solution of Direct Red 23 dye, optimizing various parameters for efficient degradation of the dye. The effect of several operating parameters such as flow rate, pH, temperature, initial dye concentration and inoculum size on dye removal was investigated. The efficiency of the isolated bacterial consortium from the dye-contaminated area was evaluated in the Sequencing Air-lift Bioreactor at dye concentrations between 100 and 1200 mg/l and at hydraulic retention times (HRTs) of 26 h and 10 h. The maximum dye decolourization of 98% was achieved when operated at an HRT of 26 h. The percentage of decolourization was confirmed using a UV-Vis spectrophotometer and HPLC.

Keywords: carpet industry, bacterial consortia, sequencing air-lift bioreactor

Procedia PDF Downloads 317
1639 The Evolution Characteristics of Urban Ecological Patterns in Parallel Range-Valley Areas, China

Authors: Wen Feiming

Abstract:

The Parallel Range-Valley area is the ecological barrier of the Yangtze River, so its ecological security is very important. However, its unique geomorphic features aggravate the contradiction between man and land, resulting in the encroachment of ecological space. In recent years, relevant research has focused on the single fields of land science, ecology and landscape ecology, and it has been difficult to systematically reflect the regularities of distribution and the evolution trends of ecological patterns in the process of urban development. Therefore, from the perspective of "Production-Living-Ecological space", using spatial analysis methods such as Remote Sensing (RS) and Geographic Information Systems (GIS), this paper analyzes the evolution characteristics and driving factors of the ecological pattern of mountain towns in the parallel range-valley region from the aspects of land use structure, change rate, transformation relationship, and spatial correlation. It is concluded that the ecological pattern of mountain towns presents a trend from expansion and diffusion to agglomeration, and the dynamic spatial transfer shows a trend from artificial transformation to the natural origin, while the driving effect analysis shows the significant characteristics of terrain attraction and construction barrier. Finally, combining the evolution characteristics and driving mechanism, the evolution modes of "mountain area - concentrated growth", "trough area - diffusion attenuation" and "flat area - concentrated attenuation" are summarized, and differentiated zoning and stratified ecological planning strategies are proposed, in order to provide a theoretical basis for the sustainable development of mountain towns in parallel range-valley areas.

Keywords: parallel range-valley, ecological pattern, evolution characteristics, driving factors

Procedia PDF Downloads 70
1638 CMPD: Cancer Mutant Proteome Database

Authors: Po-Jung Huang, Chi-Ching Lee, Bertrand Chin-Ming Tan, Yuan-Ming Yeh, Julie Lichieh Chu, Tin-Wen Chen, Cheng-Yang Lee, Ruei-Chi Gan, Hsuan Liu, Petrus Tang

Abstract:

Whole-exome sequencing, which focuses on the protein-coding regions of disease/cancer-associated genes based on a priori knowledge, is the most cost-effective method to study the association between genetic alterations and disease. Recent advances in high-throughput sequencing technologies and proteomic techniques have provided an opportunity to integrate genomics and proteomics, allowing mutated peptides corresponding to mutated genes to be readily detected. Since sequence database search is the most widely used method for protein identification in mass spectrometry (MS)-based proteomics, a mutant proteome database is required to better approximate the real protein pool and improve the identification of disease-associated mutated proteins. Large-scale whole exome/genome sequencing studies were launched by the National Cancer Institute (NCI), the Broad Institute, and The Cancer Genome Atlas (TCGA), which provide not only a comprehensive report on the analysis of coding variants in diverse samples and cell lines but also an invaluable resource for the wider research community. No existing database collects the mutant protein sequences related to the variants identified in these studies. CMPD is designed to address this issue, serving as a bridge between genomic data and proteomic studies and focusing on protein sequence-altering variations originating from both germline and cancer-associated somatic variations.

Keywords: TCGA, cancer, mutant, proteome

Procedia PDF Downloads 567
1637 Metaheuristics to Solve Tasks Scheduling

Authors: Rachid Ziteuni, Selt Omar

Abstract:

In this paper, we propose a new polynomial metaheuristic (tabu search) for solving scheduling problems. This method allows us to solve the problem of scheduling n tasks on m identical parallel machines with unavailability periods. This problem is NP-complete in the strong sense, and finding an optimal solution appears unlikely. Note that all data in this problem are integer and deterministic. The performance criterion to optimize in this problem, which we denote Pm/N-c/Σ wjCj, is the weighted sum of the completion times of the tasks.
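
To make the problem concrete, here is a small tabu search sketch for the weighted sum of completion times on identical machines with unavailability periods. It is not the authors' method: the job-order encoding, the list-scheduling decoder, the swap neighbourhood, the tabu tenure, and the assumption that jobs are non-resumable (a job cannot be interrupted by an unavailability period) are all illustrative choices, and the instance data are invented.

```python
from collections import deque
from itertools import combinations


def earliest_finish(ready, duration, blocked):
    """Finish time of a non-resumable job started at or after `ready`.
    `blocked` holds the machine's unavailability intervals [s, e), sorted by start."""
    start = ready
    for s, e in blocked:
        if start < e and start + duration > s:  # job would overlap the interval
            start = e                            # wait until the machine is back
    return start + duration


def weighted_completion(order, jobs, unavailable):
    """Decode a job order by list scheduling and return the sum of w_j * C_j."""
    ready = [0] * len(unavailable)
    total = 0
    for j in order:
        p, w = jobs[j]
        finish, machine = min(
            (earliest_finish(ready[i], p, unavailable[i]), i)
            for i in range(len(unavailable))
        )
        ready[machine] = finish
        total += w * finish
    return total


def tabu_search(jobs, unavailable, iterations=200, tenure=5):
    # Start from the weighted-shortest-processing-time order (smallest p/w first).
    current = sorted(range(len(jobs)), key=lambda j: jobs[j][0] / jobs[j][1])
    best, best_cost = current[:], weighted_completion(current, jobs, unavailable)
    tabu = deque(maxlen=tenure)  # recently swapped job pairs
    for _ in range(iterations):
        candidates = []
        for a, b in combinations(range(len(current)), 2):
            neighbour = current[:]
            neighbour[a], neighbour[b] = neighbour[b], neighbour[a]
            cost = weighted_completion(neighbour, jobs, unavailable)
            move = tuple(sorted((current[a], current[b])))
            # A tabu move is allowed only if it beats the best cost (aspiration).
            if move not in tabu or cost < best_cost:
                candidates.append((cost, move, neighbour))
        if not candidates:
            break
        cost, move, current = min(candidates)  # best admissible neighbour
        tabu.append(move)
        if cost < best_cost:
            best, best_cost = current[:], cost
    return best, best_cost


# Invented instance: (processing time, weight) per job, integer data as in the abstract.
jobs = [(3, 2), (2, 1), (4, 3), (1, 1), (5, 2)]
unavailable = [[(4, 6)], [(2, 3)]]  # one unavailability period per machine
print(tabu_search(jobs, unavailable))
```

Each iteration evaluates all n(n-1)/2 swaps, so the running time is polynomial in the number of tasks for a fixed iteration budget, which is the sense in which such a heuristic can be called polynomial.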

Keywords: scheduling, parallel identical machines, unavailability periods, metaheuristic, tabu search

Procedia PDF Downloads 305
1636 Fault Analysis of Ship Power System Comprising of Parallel Generators and Variable Frequency Drive

Authors: Umair Ashraf, Kjetil Uhlen, Sverre Eriksen, Nadeem Jelani

Abstract:

Although advances in technology have increased the reliability and ease of work in ship power systems, they also add complexity. Ever-increasing non-linear loads, such as power electronics (PE) devices, affect the stability of the system. Frequent load variations and complex load dynamics arise from the frequency converters and motor drives; these problems are more prominent when the system is connected to a weak grid. In the ship power system, the major consumers are the thruster motors used for propulsion. Variable frequency drives (VFDs) are used to control these motors, and most VFDs operate at the nominal voltage of the system. Some of the consumers on a ship operate at a lower voltage than nominal; these consumers are supplied through step-down transformers. In this paper, a vector control scheme is used to control both the rectifier and the inverter, and parallel operation of the synchronous generators is also demonstrated. The simulations have been performed with an induction motor as the load on the VFD and a parallel RLC load. Fault analysis has been performed first for the system without a VFD and then for the system with a VFD. Three-phase-to-ground and single-phase-to-ground faults were implemented, and the behavior of the system in both cases was observed.

Keywords: non-linear load, power electronics, parallel operating generators, pulse width modulation, variable frequency drives, voltage source converters, weak grid

Procedia PDF Downloads 551
1635 Time Bound Parallel Processing of a Disaster Management Alert System Using Random Selection of Target Audience: Bangladesh Context

Authors: Hasan Al Bashar Abul Ulayee, AKM Saifun Nabi, MD Mesbah-Ul-Awal

Abstract:

Alert systems for disaster management are common nowadays and can play a vital role in reducing devastation and saving lives and costs. An alert at the right time can save thousands of human lives, help people to take shelter and manage other assets including livestock and, above all, help them to prepare for the situation and recover from it early. In a country like Bangladesh, where the population is more than 170 million and which constantly faces different types of natural calamities and disasters, an early, timely alert is very effective, but implementing an alert system is challenging. The challenge comes from the time constraint of alerting such a huge population. Existing disaster management pre-alert methods are traditional, sequential and non-selective, so their efficiency is not good enough. This paper describes a way in which an alert can be delivered to the maximum number of people within a short time bound using parallel processing as well as random selection of a selective target audience.
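
The combination of random target selection and parallel dispatch can be sketched as follows. The abstract does not describe the selection rule, the delivery channel or how the time bound is enforced, so everything here is an assumption: `send_alert` is a hypothetical stand-in for a real SMS gateway call, the subscriber numbers are invented, and the time bound itself is not modelled.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def send_alert(number, message):
    """Stand-in for a real SMS gateway call (hypothetical); returns a delivery record."""
    return (number, message)


def broadcast(subscribers, message, sample_size, workers=8, seed=None):
    """Pick a random subset of subscribers and dispatch the alert to them concurrently."""
    rng = random.Random(seed)
    targets = rng.sample(subscribers, min(sample_size, len(subscribers)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda number: send_alert(number, message), targets))


subscribers = [f"+8801{n:09d}" for n in range(1000)]  # invented numbers
delivered = broadcast(subscribers, "Cyclone warning: move to shelter", 50, seed=1)
print(len(delivered))
```

Sampling a subset bounds the number of messages that must go out, and the worker pool sends them concurrently instead of one after another, which is the contrast with sequential, non-selective alerting drawn in the abstract.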

Keywords: alert system, Bangladesh, disaster management, parallel processing, SMS

Procedia PDF Downloads 451
1634 Parallel Pipelined Conjugate Gradient Algorithm on Heterogeneous Platforms

Authors: Sergey Kopysov, Nikita Nedozhogin, Leonid Tonkov

Abstract:

The article presents a parallel iterative solver for large sparse linear systems which can be used on a heterogeneous platform. Traditionally, the problem of solving linear systems does not scale well on multi-CPU/multi-GPU clusters. For example, most attempts to implement the classical conjugate gradient method at best kept the solution time constant as the problem was enlarged. The paper proposes a pipelined variant of the conjugate gradient method (PCG), a formulation that is potentially better suited to hybrid CPU/GPU computing since it requires only one synchronization point per iteration instead of two for standard CG. The standard and pipelined CG methods need the vector entries generated by the current GPU and other GPUs for matrix-vector products, so communication between GPUs becomes a major performance bottleneck on a multi-GPU cluster. The article presents an approach to minimize the communication between parallel parts of the algorithm. Additionally, computation and communication can be overlapped to reduce the impact of data exchange. Using the pipelined version of the CG method with one synchronization point, together with asynchronous calculations and communications and load balancing between the CPU and GPU, allows for scalability when solving large linear systems. The algorithm is implemented with the combined use of MPI, OpenMP, and CUDA. We show that almost optimal speedup on 8-CPU/2GPU may be reached (relative to a one-GPU execution). The parallelized solver achieves a speedup of up to 5.49 times on 16 NVIDIA Tesla GPUs, as compared to one GPU.
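
The idea of a single synchronization point per iteration can be seen in a serial sketch of one well-known pipelined CG formulation (the unpreconditioned variant due to Ghysels and Vanroose). Whether the article uses exactly this recurrence is not stated in the abstract, so treat it as an assumption; the sketch is plain Python on a small dense matrix and contains no MPI, OpenMP or CUDA.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pipelined_cg(A, b, tol=1e-10, max_iter=100):
    """Pipelined conjugate gradient for a symmetric positive definite matrix A."""
    n = len(b)
    x = [0.0] * n
    r = [float(v) for v in b]      # r = b - A x with x = 0
    w = matvec(A, r)               # w = A r
    z = [0.0] * n                  # z tracks A s
    s = [0.0] * n                  # s tracks A p
    p = [0.0] * n
    gamma_old = alpha_old = 1.0
    for i in range(max_iter):
        # Both dot products are computed together: in a distributed run
        # they form the single global reduction of the iteration.
        gamma = dot(r, r)
        delta = dot(w, r)
        if gamma ** 0.5 < tol:
            break
        # This product does not depend on gamma or delta, so it can be
        # overlapped with the reduction above.
        q = matvec(A, w)
        if i == 0:
            beta = 0.0
            alpha = gamma / delta
        else:
            beta = gamma / gamma_old
            alpha = gamma / (delta - beta * gamma / alpha_old)
        z = [qi + beta * zi for qi, zi in zip(q, z)]
        s = [wi + beta * si for wi, si in zip(w, s)]
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * si for ri, si in zip(r, s)]
        w = [wi - alpha * zi for wi, zi in zip(w, z)]
        gamma_old, alpha_old = gamma, alpha
    return x


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = pipelined_cg(A, b)
print(x, matvec(A, x))
```

Standard CG needs the result of one dot product before it can compute the next, which gives two reductions per iteration; here the extra recurrences for `w`, `s` and `z` remove that dependency at the cost of more vector updates.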

Keywords: conjugate gradient, GPU, parallel programming, pipelined algorithm

Procedia PDF Downloads 136
1633 Mutation Profiling of Paediatric Solid Tumours in a Cohort of South African Patients

Authors: L. Lamola, E. Manolas, A. Krause

Abstract:

Background: The incidence of childhood cancer is increasing gradually in low-middle income countries, such as South Africa. Globally, there is an extensive range of familial and hereditary cancer syndromes, where underlying germline variants increase the likelihood of developing cancer in childhood. Next-Generation Sequencing (NGS) technologies have been key in determining the occurrence and genetic contribution of germline variants to paediatric cancer development. We aimed to design and evaluate a candidate gene panel specific to inherited cancer-predisposing genes to provide a comprehensive insight into the contribution of germline variants to childhood cancer. Methods: 32 paediatric patients (aged 0-18 years) diagnosed with a malignant tumour were recruited, and biological samples were obtained. After quality control, DNA was sequenced using an Ion AmpliSeq 50 candidate gene panel design and Ion Torrent S5 technologies. Sequencing variants were called using Ion Torrent Suite software and were subsequently annotated using Ion Reporter and Ensembl's VEP. High-priority variants were manually analysed using tools such as MutationTaster, SIFT-INDEL and VarSome. Putative identified candidates were validated via Sanger sequencing. Results: The patients studied had a variety of cancers, the most common being nephroblastoma (13), followed by osteosarcoma (4) and astrocytoma (3). We identified 10 pathogenic / likely pathogenic variants in 10 patients, most of which were novel. Conclusions: According to the literature, we expected ~10% of our patient population to harbour pathogenic or likely pathogenic germline variants; however, we found about 3 times more (~30%) than expected. The majority of the identified variants are novel; this may be because this is the first study of its kind in an understudied South African population.

Keywords: Africa, genetics, germline-variants, paediatric-cancer

Procedia PDF Downloads 117
1632 The Automatisation of Dictionary-Based Annotation in a Parallel Corpus of Old English

Authors: Ana Elvira Ojanguren Lopez, Javier Martin Arista

Abstract:

The aims of this paper are to present the automatisation procedure adopted in the implementation of a parallel corpus of Old English, as well as to assess the progress of automatisation with respect to tagging, annotation, and lemmatisation. The corpus consists of an aligned parallel text with word-for-word comparison Old English-English that provides the Old English segment with inflectional form tagging (gloss, lemma, category, and inflection) and lemma annotation (spelling, meaning, inflectional class, paradigm, word-formation and secondary sources). This parallel corpus is intended to fill a gap in the field of Old English, in which no parallel and/or lemmatised corpora are available, while the average amount of corpus annotation is low. With this background, this presentation has two main parts. The first part, which focuses on tagging and annotation, selects the layouts and fields of lexical databases that are relevant for these tasks. Most information used for the annotation of the corpus can be retrieved from the lexical and morphological database Nerthus and the database of secondary sources Freya. These are the sources of linguistic and metalinguistic information that will be used for the annotation of the lemmas of the corpus, including morphological and semantic aspects as well as the references to the secondary sources that deal with the lemmas in question. Although substantially adapted and re-interpreted, the lemmatised part of these databases draws on the standard dictionaries of Old English, including The Student's Dictionary of Anglo-Saxon, An Anglo-Saxon Dictionary, and A Concise Anglo-Saxon Dictionary. The second part of this paper deals with lemmatisation. It presents the lemmatiser Norna, which has been implemented on FileMaker software. It is based on a concordance and an index to the Dictionary of Old English Corpus, which comprises around three thousand texts and three million words.
In its present state, the lemmatiser Norna can assign a lemma to around 80% of textual forms on an automatic basis, by searching the index and the concordance for prefixes, stems and inflectional endings. The conclusions of this presentation stress the limits of the automatisation of dictionary-based annotation in a parallel corpus. While the tagging and annotation are largely automatic even at the present stage, the automatisation of alignment is left for future research. Lemmatisation and morphological tagging are expected to be fully automatic in the near future, once the database of secondary sources Freya and the lemmatiser Norna have been completed.

Keywords: corpus linguistics, historical linguistics, old English, parallel corpus

Procedia PDF Downloads 180
1631 Parallel Multisplitting Methods for DAE’s

Authors: Ahmed Machmoum, Malika El Kyal

Abstract:

We consider an iterative parallel multi-splitting method for differential algebraic equations. The main feature of the proposed idea is the use of the asynchronous form. We prove that the multi-splitting technique can effectively accelerate the convergence of the iterative process. The main characteristic of an asynchronous mode is that the local algorithm does not have to wait at predetermined points for messages to become available. We allow some processors to communicate more frequently than others, and we allow the communication delays to be substantial and unpredictable. Note that synchronous algorithms in the computer science sense are particular cases of our formulation of asynchronous ones.

Keywords: computer, multi-splitting methods, asynchronous mode, differential algebraic systems

Procedia PDF Downloads 527
1630 An Analysis System for Integrating High-Throughput Transcript Abundance Data with Metabolic Pathways in Green Algae

Authors: Han-Qin Zheng, Yi-Fan Chiang-Hsieh, Chia-Hung Chien, Wen-Chi Chang

Abstract:

As the most important non-vascular plants, algae have many research applications, including high species diversity, biofuel sources, adsorption of heavy metals and, following processing, health supplements. With the increasing availability of next-generation sequencing (NGS) data for algae genomes and transcriptomes, an integrated resource for retrieving gene expression data and metabolic pathways is essential for functional analysis and systems biology in algae. However, gene expression profiles and biological pathways are displayed separately in current resources, making it impossible to search current databases directly to identify the cellular response mechanisms. Therefore, this work develops a novel AlgaePath database to retrieve gene expression profiles efficiently under various conditions in numerous metabolic pathways. AlgaePath, a web-based database, integrates gene information, biological pathways, and next-generation sequencing (NGS) datasets in Chlamydomonas reinhardtii and Neodesmus sp. UTEX 2219-4. Users can identify gene expression profiles and pathway information by using five query pages (i.e. Gene Search, Pathway Search, Differentially Expressed Genes (DEGs) Search, Gene Group Analysis, and Co-Expression Analysis). The gene expression data of 45 and 4 samples can be obtained directly on pathway maps in C. reinhardtii and Neodesmus sp. UTEX 2219-4, respectively. Genes that are differentially expressed between two conditions can be identified in Folds Search. Furthermore, the Gene Group Analysis of AlgaePath includes pathway enrichment analysis, and can easily compare the gene expression profiles of functionally related genes in a map. Finally, Co-Expression Analysis provides co-expressed transcripts of a target gene. The analysis results provide a valuable reference for designing further experiments and elucidating critical mechanisms from high-throughput data.
More than an effective interface to clarify the transcript response mechanisms in different metabolic pathways under various conditions, AlgaePath is also a data mining system to identify critical mechanisms based on high-throughput sequencing.

Keywords: next-generation sequencing (NGS), algae, transcriptome, metabolic pathway, co-expression

Procedia PDF Downloads 387
1629 Collision Detection Algorithm Based on Data Parallelism

Authors: Zhen Peng, Baifeng Wu

Abstract:

Modern computing technology has entered the era of parallel computing, with a trend toward sustainable and scalable parallelism. Single Instruction Multiple Data (SIMD) is an important way to follow this trend. It can harness more and more computing power by increasing the number of processor cores without the need to modify the program. Meanwhile, in the field of scientific computing and engineering design, many computation-intensive applications are facing the challenge of increasingly large amounts of data. Data parallel computing will be an important way to further improve the performance of these applications. In this paper, we take accurate collision detection in building information modeling as an example. We demonstrate a model for constructing a data parallel algorithm. According to the model, a complex object is decomposed into sets of simple objects; collision detection among complex objects is converted into collision detection among simple objects. The resulting algorithm is a typical SIMD algorithm, and its advantages in parallelism and scalability are unparalleled with respect to traditional algorithms.
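
The decomposition step can be illustrated with a minimal sketch in which the "simple objects" are axis-aligned boxes. That choice is an assumption made for brevity (the abstract does not say which primitives the authors use), and the wall and pipe geometry is invented.

```python
from itertools import product


def boxes_overlap(a, b):
    """Axis-aligned boxes given as ((xmin, ymin, zmin), (xmax, ymax, zmax))."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))


def objects_collide(obj_a, obj_b):
    """Each complex object is a list of simple boxes.
    Every pair test runs the same instructions on different data."""
    return any(boxes_overlap(a, b) for a, b in product(obj_a, obj_b))


# Invented building elements: one wall slab and an L-shaped pipe made of two boxes.
wall = [((0, 0, 0), (4, 0.2, 3))]
pipe = [((1, -1, 1), (1.2, 1, 1.2)), ((1, 1, 1), (3, 1.2, 1.2))]
far_pipe = [((1, 2, 1), (3, 2.2, 1.2))]

print(objects_collide(wall, pipe))      # the first pipe segment passes through the wall
print(objects_collide(wall, far_pipe))  # no box pair overlaps
```

Because the pair tests are independent and identical in form, they map directly onto SIMD lanes or GPU threads; the serial `any(...)` here only stands in for that parallel evaluation followed by a reduction.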

Keywords: data parallelism, collision detection, single instruction multiple data, building information modeling, continuous scalability

Procedia PDF Downloads 268
1628 Binding of Avian Excreta-Derived Enterococci to a Streptococcus mutans: Implications for Avian to Human Transmission

Authors: Richard K. Jolley, Jonathan A. Coffman

Abstract:

Since Enterococci have been implicated in oral disease, we hypothesized that avian Enterococci are transmitted to humans via fecal-oral transmission facilitated by adherence to dental plaque. To demonstrate the capability of Enterococci to bind to dental plaque, we filtered avian excreta and incubated the filtrate on a sucrose-induced Streptococcus mutans biofilm. The biofilm was washed several times with a detergent to remove bacteria binding non-specifically to the biofilm, DNA was isolated from the biofilm, and 16S rDNA was amplified, sequenced by Ion Torrent DNA sequencing and analyzed with bioinformatics. Enterococci and other known bacterial pathogens were shown to adhere to the biofilm. Culturing the washed biofilm on Bile Esculin Azide (BEA) agar also confirmed the presence of Enterococci, as verified with Sanger sequencing. The results suggest that Enterococci in avian excreta have the ability to adhere to human dental plaque and that this may be a mechanism of entry when humans encounter contaminated aerosols, water or food.

Keywords: Enterococci, avian excreta, dental plaque, NGS

Procedia PDF Downloads 135
1627 Parallel Version of Reinhard’s Color Transfer Algorithm

Authors: Abhishek Bhardwaj, Manish Kumar Bajpai

Abstract:

An image, with its content and color scheme, is an effective medium for sharing and processing information. Changing its color scheme lets users discover different views and perspectives. This technique, color transfer, is used by social media and other entertainment channels. The algorithm of Reinhard et al. was the first to solve the color transfer problem. In this paper, we make this algorithm more efficient by introducing domain parallelism among different processors. We also comment on the factors that affect the speedup for this problem. Finally, by analyzing the experimental data, we claim to propose a novel and efficient parallel version of Reinhard’s algorithm.
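As a rough illustration of domain parallelism applied to statistics-based color transfer, the sketch below matches the mean and standard deviation of one channel of a source image to those of a target image, with the pixel data split into chunks. It is an assumption-laden sketch, not the authors' implementation: the color-space conversion used by Reinhard et al. is omitted, a channel is a flat list of floats, and the helper names (`split`, `partial_stats`, `mean_std`, `transfer_channel`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt

def split(values, parts):
    """Cut a flat list of channel values into roughly equal contiguous chunks."""
    size = max(1, -(-len(values) // parts))  # ceiling division, at least 1
    return [values[i:i + size] for i in range(0, len(values), size)]

def partial_stats(chunk):
    """Per-chunk count, sum and sum of squares; these combine by addition."""
    return len(chunk), sum(chunk), sum(v * v for v in chunk)

def mean_std(values, workers=4):
    """Mean and population standard deviation from per-chunk partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stats = list(pool.map(partial_stats, split(values, workers)))
    n = sum(s[0] for s in stats)
    total = sum(s[1] for s in stats)
    total_sq = sum(s[2] for s in stats)
    mean = total / n
    return mean, sqrt(max(total_sq / n - mean * mean, 0.0))

def transfer_channel(source, target, workers=4):
    """Shift and scale one channel of `source` to match the statistics of `target`."""
    s_mean, s_std = mean_std(source, workers)
    t_mean, t_std = mean_std(target, workers)
    scale = t_std / s_std if s_std > 0 else 1.0

    def remap(chunk):
        # Each chunk is remapped independently once the global statistics are known.
        return [(v - s_mean) * scale + t_mean for v in chunk]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(remap, split(source, workers)))
    return [v for chunk in chunks for v in chunk]

result = transfer_channel([0.0, 2.0, 4.0, 6.0], [10.0, 10.5, 11.0, 11.5])
print(result)
```

The threads here only show how the domain is decomposed; in CPython they would not speed up this CPU-bound work, so a practical version would use processes or an array library. The reduction step (combining partial sums) is the only point where chunks must communicate.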

Keywords: Reinhard et al’s algorithm, color transferring, parallelism, speedup

Procedia PDF Downloads 589
1626 Parallel Processing in Near Absence of Attention: A Study Using Dual-Task Paradigm

Authors: Aarushi Agarwal, Tara Singh, I.L Singh, Anju Lata Singh, Trayambak Tiwari

Abstract:

Simple discrimination in the near absence of attention has been widely observed. In dual-task studies, processing of natural scenes has been claimed to be preattentive in nature, allowing categorization to proceed simultaneously with an attentionally demanding task. In this study, multiple images were therefore presented at the periphery to initiate parallel processing in the near absence of attention. For the demanding central task, rotated letters were presented in both conditions, while natural and animal images were presented in the periphery. To find the breaking point of the ability to perform in the near absence of attention, one, two, and three peripheral images were presented simultaneously with the central task, and subjects had to respond when all images belonged to the same category. Individual participant performance showed no significant difference in either the central or the peripheral task when a single peripheral image was shown. With two images, high-level parallel processing could take place with few attentional resources. The eye-tracking results support this finding, as no major saccade was made in a large number of trials. Presenting three images proved to be the breaking point of the capacity to perform without attentional assistance: participants showed a confused eye-gaze pattern and failed to discriminate between natural and animal images. Thus, we conclude that attention and awareness are independent mechanisms with limited capacities.

Keywords: attention, dual task paradigm, parallel processing, break point, saccade

Procedia PDF Downloads 197
1625 The Design of Multiple Detection Parallel Combined Spread Spectrum Communication System

Authors: Lixin Tian, Wei Xue

Abstract:

Much work in society takes place underground, such as mining, tunnel construction, and subway operation, all of which are vital to social development. When accidents occur in these places, the interruption of traditional wired communication hinders rescue work. To realize positioning, early warning, and command functions for underground personnel and to improve rescue efficiency, it is necessary to design and develop an emergency ground communication system. Conventional underground communication is easily subjected to narrowband interference, a problem that spread spectrum communication can address. However, general spread spectrum methods such as direct spread communication are inefficient, so parallel combined spread spectrum (PCSS) communication is proposed to improve efficiency. PCSS communication not only has the anti-interference ability and good concealment of a traditional spread spectrum system, but also offers relatively high frequency band utilization and strong information transmission capability, and the technology has therefore been widely used in practice. This paper presents a PCSS communication model: the multiple detection parallel combined spread spectrum (MDPCSS) communication system. The principle of the MDPCSS communication system is described: the sequence at the transmitting end is processed in blocks and cyclically shifted to facilitate multiple detection at the receiving end. Block diagrams of the transmitter and receiver are introduced, the formula for the system bit error rate (BER) is given, and the BER is simulated and analyzed. Comparison with common PCSS communication shows that the MDPCSS system does reduce the BER and improve system performance. Furthermore, the influence of the pseudo-code length on the system BER is simulated and analyzed; the conclusion is that the longer the pseudo-code, the lower the system error rate.
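The role of cyclic shifts and correlation detection can be sketched with a toy example. This is not the MDPCSS design from the paper: it assumes a single length-7 pseudo-noise (PN) code generated by a simple linear recurrence, and the function names are hypothetical. It only shows why a receiver can recover which cyclic shift was sent: a shifted copy of such a code correlates strongly with itself and weakly with every other shift.

```python
def pn_sequence():
    """Length-7 PN code from the recurrence a[n+3] = a[n+1] XOR a[n]."""
    bits = [1, 0, 0]
    while len(bits) < 7:
        bits.append(bits[-2] ^ bits[-3])
    return [1 if b else -1 for b in bits]  # map bits {0, 1} to chips {-1, +1}

def cyclic_shift(chips, k):
    """Rotate the chip sequence left by k positions."""
    k %= len(chips)
    return chips[k:] + chips[:k]

def correlate(a, b):
    """Inner product of two equal-length chip sequences."""
    return sum(x * y for x, y in zip(a, b))

def detect_shift(received, reference):
    """Return the cyclic shift of `reference` that correlates best with `received`."""
    scores = [correlate(received, cyclic_shift(reference, k))
              for k in range(len(reference))]
    return max(range(len(scores)), key=scores.__getitem__)

reference = pn_sequence()
received = cyclic_shift(reference, 2)
received[0] = -received[0]  # one chip corrupted, as a stand-in for interference
print(detect_shift(received, reference))
```

For this code the correlation is 7 at the matching shift and -1 at every other shift, so a single corrupted chip still leaves the correct shift as the clear maximum. Longer codes widen that margin, which is consistent in spirit with the abstract's observation about pseudo-code length.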

Keywords: cyclic shift, multiple detection, parallel combined spread spectrum, PN code

Procedia PDF Downloads 115
1624 Genetic Diversity and Discovery of Unique SNPs in Five Country Cultivars of Sesamum indicum by Next-Generation Sequencing

Authors: Nam-Kuk Kim, Jin Kim, Soomin Park, Changhee Lee, Mijin Chu, Seong-Hun Lee

Abstract:

In this study, we conducted whole genome re-sequencing of 10 cultivars originating from five countries (Korea, China, India, Pakistan, and Ethiopia), with the Sesamum indicum (Zhongzhi No. 13) genome as a reference. Almost 80% of the reference genome sequence was covered by sequenced reads. Numerous SNPs and InDels were detected by bioinformatic analysis. Among these variants, 266,051 SNPs were identified as unique to individual countries. Pakistan and Ethiopia had high SNP densities compared to the other countries. Three main clusters (cluster 1: Korea; cluster 2: Pakistan and India; cluster 3: Ethiopia and China) were recovered by neighbor-joining analysis using all variants. Interestingly, some variants were detected in the DGAT1 (diacylglycerol O-acyltransferase 1) and FADS (fatty acid desaturase) genes, which are known to be related to fatty acid synthesis and metabolism. These results provide useful information for understanding regional characteristics and developing DNA markers for origin discrimination of sesame.

Keywords: Sesamum indicum, NGS, SNP, DNA marker

Procedia PDF Downloads 305
1623 Metagenomics-Based Molecular Epidemiology of Viral Diseases

Authors: Vyacheslav Furtak, Merja Roivainen, Olga Mirochnichenko, Majid Laassri, Bella Bidzhieva, Tatiana Zagorodnyaya, Vladimir Chizhikov, Konstantin Chumakov

Abstract:

Molecular epidemiology and environmental surveillance are parts of a rational strategy to control infectious diseases. They have been widely used in the worldwide campaign to eradicate poliomyelitis, which otherwise would be complicated by the inability to rapidly respond to outbreaks and determine sources of infection. The conventional scheme involves isolation of viruses from patients and the environment, followed by their identification by nucleotide sequence analysis to determine phylogenetic relationships. This is a tedious and time-consuming process that yields definitive results when it may be too late to implement countermeasures. Because of the difficulty of high-throughput full-genome sequencing, most such studies are conducted by sequencing only capsid genes or parts of them. Therefore, important information about the contribution of other parts of the genome and of inter- and intra-species recombination to viral evolution is not captured. Here we propose a new approach based on the rapid concentration of sewage samples by tangential flow filtration, followed by deep sequencing and reconstruction of the nucleotide sequences of viruses present in the samples. The entire nucleic acid content of each sample is sequenced, thus preserving the complete spectrum of viruses in digital format. A set of rapid algorithms was developed to separate deep sequencing reads into discrete populations corresponding to each virus and assemble them into full-length consensus contigs, as well as to generate a complete profile of sequence heterogeneities in each of them. This provides an effective approach to studying the molecular epidemiology and evolution of natural viral populations.

Keywords: poliovirus, eradication, environmental surveillance, laboratory diagnosis

Procedia PDF Downloads 254
1622 An Unbiased Profiling of Immune Repertoire via Sequencing and Analyzing T-Cell Receptor Genes

Authors: Yi-Lin Chen, Sheng-Jou Hung, Tsunglin Liu

Abstract:

The adaptive immune system recognizes a wide range of antigens by expressing a large number of structurally distinct T cell and B cell receptor genes. These distinct receptor genes arise from complex rearrangements called V(D)J recombination and constitute the immune repertoire. A common method of profiling the immune repertoire is to amplify recombined receptor genes using multiple primers, followed by high-throughput sequencing. This multiplex-PCR approach is efficient; however, the resulting repertoire can be distorted by primer bias. 5’ RACE is an alternative amplification approach that eliminates primer bias. However, the application of the RACE approach is limited by its low efficiency (i.e., the majority of the data are non-regular receptor sequences, e.g., containing intronic segments) and by the lack of a convenient analysis tool. We propose a computational tool that can correctly identify non-regular receptor sequences in RACE data by aligning receptor sequences against the whole gene instead of only the exon regions, as is done in all other tools. With our tool, the remaining regular data allow accurate profiling of the immune repertoire. In addition, the RACE approach is improved to yield a higher fraction of regular T-cell receptor sequences. Finally, we quantify the degree of primer bias of a multiplex-PCR approach by comparing it to the RACE approach. The results reveal significant differences in the frequency of VJ combinations between the two approaches. Together, we provide a new experimental and computational pipeline for unbiased profiling of the immune repertoire. As immune repertoire profiling has many applications, e.g., tracing bacterial and viral infection, detecting T cell lymphoma and minimal residual disease, and monitoring cancer immunotherapy, our work should benefit scientists interested in these applications.

Keywords: immune repertoire, T-cell receptor, 5' RACE, high-throughput sequencing, sequence alignment

Procedia PDF Downloads 172
1621 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of storage devices that differ in data reading speed. Reading speed increases as we ascend the hierarchy. Thus, migrating the application’s important data that will be accessed in the near future to the uppermost level reduces the application’s I/O waiting time and hence its elapsed execution time. In this research, we implement a trace-driven, two-level parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify the application’s data in order to determine its near-future data accesses, in parallel with its on-demand requests. The important data (i.e., the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach, integrated with data mining techniques, reduces the application’s elapsed execution time by at least 22% across a variety of traces.

Keywords: hybrid storage system, data mining, recurrent neural network, support vector machine

Procedia PDF Downloads 285
1620 Characterization of the Blood Microbiome in Rheumatoid Arthritis Patients Compared to Healthy Control Subjects Using V4 Region 16S rRNA Sequencing

Authors: D. Hammad, D. P. Tonge

Abstract:

Rheumatoid arthritis (RA) is a common and disabling autoimmune disease in which the body's immune system attacks healthy tissues. This triggers complicated and long-lasting immune responses of the kind that typically occur only when the immune system encounters a foreign object. RA affects millions of people and causes joint inflammation, ultimately leading to the destruction of cartilage and bone. The disease mechanism remains unclear. It is likely that RA results from a complex interplay of genetic and environmental factors, including an imbalance in the microorganism population inside the body. The human microbiome, or microbiota, is an extensive community of microorganisms in and on the body, comprising bacteria, fungi, viruses, and protozoa. Recently, the development of molecular techniques to characterize entire bacterial communities has renewed interest in the involvement of the microbiome in the development and progression of RA. We believe that an imbalance in specific bacterial species in the gut, mouth, and other sites may lead to atopobiosis, the translocation of these organisms into the blood, and that this may lead to changes in immune system status. The aim of this study was, therefore, to characterize the microbiome of RA serum samples in comparison to healthy control subjects using 16S rRNA gene amplification and sequencing. Serum samples were obtained from healthy control volunteers and from patients with RA both prior to and following treatment. The bacterial community present in each sample was identified using V4 region 16S rRNA amplification and sequencing. Bacterial identification, to the lowest taxonomic rank, was performed using a range of bioinformatics tools. The proportions of the Lachnospiraceae, Ruminococcaceae, and Halomonadaceae families were significantly increased in the serum of RA patients compared with healthy control serum. Furthermore, the abundance of Bacteroides, Lachnospiraceae NK4A136_group, Lachnospiraceae_UCG-001, Ruminococcaceae_UCG-014, Ruminococcus-1, and Shewanella was also raised in the serum of RA patients relative to healthy control serum. These data support the notion of a blood microbiome and reveal RA-associated changes that may have significant implications for biomarker development and may present much-needed opportunities for novel therapeutic development.

Keywords: blood microbiome, gut and oral bacteria, Rheumatoid arthritis, 16S rRNA gene sequencing

Procedia PDF Downloads 107
1619 Parallel Hybrid Honeypot and IDS Architecture to Detect Network Attacks

Authors: Hafiz Gulfam Ahmad, Chuangdong Li, Zeeshan Ahmad

Abstract:

In this paper, we propose a parallel IDS and honeypot based approach to detect and analyze both known and unknown attacks, improving IDS performance and protecting the network from intruders. The main idea of our approach is to record and analyze intruder activities using both low- and high-interaction honeypots. Our architecture aims to achieve these goals by combining a signature-based IDS with honeypots and generating new signatures. The paper describes the basic components, design, and implementation of this approach and demonstrates its effectiveness in reducing the probability of network attacks.

Keywords: network security, intrusion detection, honeypot, snort, nmap

Procedia PDF Downloads 532
1618 Optimization of Production Scheduling through the Lean and Simulation Integration in Automotive Company

Authors: Guilherme Gorgulho, Carlos Roberto Camello Lima

Abstract:

In today's competitive market, constant change requires companies to react quickly to variability in demand and process. These changes arise from customers, from demand fluctuations or product variations, or from the need to serve customers within agreed delivery times, while continuously pursuing quality and competitive prices. They directly or indirectly influence the activities of Planning and Production Control (PPC), which operates at the strategic, tactical, and operational levels of production systems. One area of concern for organizations is the short term (operational level), because at this planning stage any error or divergence causes waste and affects on-time delivery of products to customers. Thus, this study aims to optimize the efficiency of production scheduling using different sequencing strategies in an automotive company. To achieve this objective, we used computer simulation in conjunction with lean manufacturing to build and validate the current model and subsequently to create future scenarios.

Keywords: computational simulation, lean manufacturing, production scheduling, sequencing strategies

Procedia PDF Downloads 252
1617 Effects of GRF on CMJ in Different Wooden Surface Systems

Authors: Yi-cheng Chen, Ming-jum Guo, Yang-ru Chen

Abstract:

Background and Objective: To ensure safety and fairness in basketball competition, FIBA specifies definite levels of physical performance for wooden surface systems (WSS). Because the systems installed in indoor stadiums vary, this study aimed to determine the effects of different WSSs, especially on ground reaction force (GRF) when players jump. Materials and Methods: Twelve participants performed counter-movement jumps (CMJ) on seven different surfaces: six WSSs built from three types of rubber shock absorber pad (SAP) in cross-fixed or parallel-fixed arrangements, and one rigid ground. GRFs at takeoff and landing were recorded with an AMTI force platform while participants performed vertical CMJs in a counterbalanced design. All data were analyzed using one-way ANOVA to evaluate whether the test variable differed significantly between surfaces. The significance level was set at α=0.05. Results: There was no significant difference in takeoff GRF between surfaces. For landing GRF, WSSs with cross-fixed SAPs were harder than those with parallel-fixed SAPs. Although the difference between landing on cross-fixed and parallel-fixed surfaces was not significant, the difference between parallel-fixed WSSs and rigid ground was significant. Landing GRF on the WSS with the hardest SAP also differed significantly from that on the other WSSs. Conclusion: Although official basketball competition takes place on FIBA-certified WSSs, GRF at takeoff or landing still varies between surfaces, so every player must warm up before a game starts. Playing basketball on an uncertified WSS is especially unsafe.

Keywords: wooden surface system, counter-movement jump, ground reaction force, shock absorber pad

Procedia PDF Downloads 419
1616 Characterization of Genus Candida Yeasts Isolated from Oral Microbiota of Brazilian Schoolchildren with Different Caries Experience

Authors: D. S. V. Barbieri, R. R. Gomes, G. D. Santos, P. F. Herkert, M. Moreira, E. S. Trindade, V. A. Vicente

Abstract:

The importance of yeast infections has increased in recent decades, and monitoring of Candida yeasts has become relevant in the study of groups and populations. This research evaluated 31 Candida spp. isolates from the oral microbiota of 12 Brazilian schoolchildren coinfected with Streptococcus mutans. The isolates were evaluated for their ability to form biofilm in vitro and were molecularly characterized by sequencing the ITS1-5.8S-ITS2 intergenic spacer regions and the variable domains (D1/D2) of the large subunit of the rDNA, as well as by ABC system genotyping. Sequencing confirmed 26 lineages of Candida albicans, three of Candida tropicalis, one of Candida guilliermondii, and one of Candida glabrata. Genetic variability and differences in biofilm formation were observed among the Candida lineages. At least one Candida strain from each child with caries activity was C. albicans genotype A or a non-albicans Candida species. C. tropicalis was associated with the highest cavity rates. These results indicate that the presence of C. albicans genotype A or multi-colonization by non-albicans species seems to be associated with increased caries risk.

Keywords: biofilm, Candida albicans, oral microbiota, caries

Procedia PDF Downloads 490
1615 The Risk of Ground Movements After Digging Two Parallel Vertical Tunnels in Urban Areas

Authors: Djelloul Chafia, Demagh Rafik, Kareche Toufik

Abstract:

Human activities carried out without precautions accelerate the degradation of soil structure and reduce its resistance. Operations such as tunnel construction may exert a more or less permanent influence on the surrounding ground; because these structures alter the soil, it is necessary to predict their impacts and take suitable measures. This research is a numerical analysis of the risks and effects of soil weakening after digging two parallel vertical circular tunnels in urban areas, and it suggests forecasting techniques based essentially on the organization of underground space. The simulations are performed using the finite-difference code FLAC in two dimensions, with elasto-plastic soil behavior.

Keywords: soil, weakening, degradation, prevention, tunnel

Procedia PDF Downloads 539