Using the PGAS Programming Paradigm for Biological Sequence Alignment on a Chip Multi-Threading Architecture

M. Bakhouya; S. A. Bahra; T. El-Ghazawi

Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 33122

Using the PGAS Programming Paradigm for Biological Sequence Alignment on a Chip Multi-Threading Architecture

Authors: M. Bakhouya, S. A. Bahra, T. El-Ghazawi

Abstract:

The Partitioned Global Address Space (PGAS) programming paradigm offers ease-of-use in expressing parallelism through a global shared address space while emphasizing performance by providing locality awareness through the partitioning of this address space. Therefore, the interest in PGAS programming languages is growing and many new languages have emerged and are becoming ubiquitously available on nearly all modern parallel architectures. Recently, new parallel machines with multiple cores are designed for targeting high performance applications. Most of the efforts have gone into benchmarking but there are a few examples of real high performance applications running on multicore machines. In this paper, we present and evaluate a parallelization technique for implementing a local DNA sequence alignment algorithm using a PGAS based language, UPC (Unified Parallel C) on a chip multithreading architecture, the UltraSPARC T1.

Keywords: Partitioned Global Address Space, Unified Parallel C, Multicore machines, Multi-threading Architecture, Sequence alignment.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1078994

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1395

References:

[1] El-Ghazawi T., Carlson W., Sterling T, Yelick K.: UPC: Distributed Shared Memory Programming. Book, John Wiley and Sons Inc., NewYork. ISBN: 0-471-22048-5, 2005.
[2] Yap T. K., FriederO., Martino R. L.: Parallel Computation in Biological Sequence Analysis. IEEE Transactions on Parallel and Distributed Systems, Vol. 9, N 3, pp. 283-294, 1998.
[3] Needleman, S. B., Wunsch C. D: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, Vol. 47, pp. 443-453, 1970.
[4] Smith W., Waterman M.: Identification of Common Molecular Subsequences. Journal of Molecular Biology Vol. 147, pp. 195-197, 1981.
[5] Gaber J.: Complexity Measure Approach for Partitioned Shared Memory Model, Application to UPC. Research report RR-10-04. Universite de Technologie de Belfort-Montbeliard, 2004.
[6] Lu M., Lin L.: Parallel algorithms for the Longest Common Subsequence Problem. IEEE Transaction on Parallel and Distributed System, Vol.5, pp.835-848, 1994.
[7] Valiant L.G.: A bridging model for parallel computation. Communication of the ACM, Vol. 33, N 8, pp. 103-111, 1990.
[8] http://www.sun.com/servers/coolthreads/t1000/benchmarks.jsp
[9] Garcia T., Myoupo J-F., Seme D.: A Coarse-Grained Multi-computer algorithm for the longest common subsequence problem. 11-th Euromicro Conference on Parallel Distributed and Network based Processing, 2003.
[10] Alves C. E. R., Cceres E. N., Dehne F.: Parallel Dynamic Programming for Solving the String Editing Problem on a CGM/BSP. In proceeding of ACM SPAA-02, pp. 275-281, 2002.
[11] Alves C. E. R. Cceres E. N., Dehne F. , Song S. W.: A Parallel Wavefront Algorithm for Efficient Biological Sequence Comparison. International Conference on Computational Science and its Applications, Montreal, Canada, May 18-21, Lecture Notes in Computer Science, V. 2668, pp. 249-258, 2003.
[12] Nicholas P. P.: Searching Biological Sequence Databases Using Distributed Adaptive Computing. Master thesis, Master of Science in Computer Engineering, Virginia Polytechnic Institute and State University, 2003, available at http://scholar.lib.vt.edu/theses/
[13] Chen Y., Wan A., Liu W.: A fast Parallel Algorithm for Finding the Longest Common Sequence of Multiple Bio-sequences. Symposium of Computations in Bioinformatics and Bioscience (SCBB06). In conjunction with the International Multi-Symposiums on Computer and Computational Sciences 2006 (IMSCCS06), 2006.
[14] Bader D.A., Madduri K.,: A Graph-Theoretic Analysis of the Human Protein-Interaction Network Using Multicore Parallel Algorithms. Sixth IEEE International Workshop on High Performance Computational Biology (HiCOMB-07), 2007.
[15] Bader D. A., Kanade V., Madduri K.: SWARM, A Parallel Programming Framework for Multicore Processors. First Workshop on Multithreaded Architectures and Applications (MTAAP-07), 2007.
[16] Voss G., Schrder A., Mller-Wittig W. Schmidt B.: Biological Sequence Alignment on Graphics Processing Units. Available at http://www.ntu.edu.sg/home/asbschmidt/paper/BioGPU.pdf
[17] Kayi A., Yao Y., El-Ghazawi T., Newby G.: Experimental Evaluation ofEmerging Multi-core Architectures, Workshop on Performance Modeling, Evaluation, and Optimisation of Ubiquitous Computing and Networked Systems (PMEO07) IPDPS07 Proceedings, pp. 1-6, 2007.
[18] Chen W-Y., Bonachea D., Duell J., Husbands P., Iancu C., Yelick K.: A Perfor- mance Analysis of the Berkley UPC Compiler. In Annual International Conference on Supercomputing (ICS), 2003.
[19] Cantonnet F., El Ghazawi T., Lorenz P., Gaber J.: Fast Address Translation Tech- niques for Distributed Shared Memory Compilers. International Parallel and Dis- tributed Processing Symposium IPDPS06, 2006.
[20] Bakhouya M., Gaber J., El-Ghazawi T.: Towards a Complexity Model for Design and Analysis of PGAS-Based Algorithms, HPCC 2007 Proceedings, LNCS 4782 Springer, ISBN 978-3-540-75443-5, pp.672-682, 2007.