Optimal All-to-All Personalized Communication in All-Port Tori
Authors: Liu Gang, Gu Nai-jie, Bi Kun, Tu Kun, Dong Wan-li
Abstract:
All-to-all personalized communication, also known as complete exchange, is one of the most dense communication patterns in parallel computing. In this paper, we propose new indirect algorithms for complete exchange on all-port ring and torus. The new algorithms fully utilize all communication links and transmit messages along shortest paths to completely achieve the theoretical lower bounds on message transmission, which have not be achieved among other existing indirect algorithms. For 2D r × c ( r % c ) all-port torus, the algorithm has time complexities of optimal transmission cost and O(c) message startup cost. In addition, the proposed algorithms accommodate non-power-of-two tori where the number of nodes in each dimension needs not be power-of-two or square. Finally, the algorithms are conceptually simple and symmetrical for every message and every node so that they can be easily implemented and achieve the optimum in practice.
Keywords: Complete exchange, collective communication, all-to-all personalized communication, parallel computing, wormhole routing, torus.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1084212
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1512References:
[1] P.K. McKinley, Y.J. Tsai, and D.F. Robinson, "A Survey of Collective Communication in Wormhole-Routed Massively Parallel Computers," Technical Report, MSU-CPS-94-35, Michigan State University, June 1994.
[2] S. Sur, H.W. Jin, and D.K. Panda, "Efficient and scalable all-to-all personalized exchange for infiniband-based clusters," Proc. 2004 Int-l Conf. on Parallel Processing (ICPP-04), pp. 275-282, Aug. 2004.
[3] C.C. Lam, C.H. Huang, and P. Sadayappan, "Optimal Algorithms for All-to-All Personalized Communication on Rings and Two Dimensional Tori," Journal of Parallel and Distributed Computing, No. 43, pp. 3-13, 1997.
[4] Y.C. Tseng and S.K.S. Gupta, "All-to-All Personalized Communication in a Wormhole-Routed Torus", IEEE Trans. Parallel and Distributed Systems, vol. 7, no. 5, pp. 498-505, May. 1996.
[5] Y.C. Tseng, T.H. Lin, S.K.S. Gupta and D.K. Panda, "Bandwidth-Optimal Complete Exchange on Wormhole- Routed 2D/3D torus Networks: A Diagonal-Propagation Approach," IEEE Trans. Parallel and Distributed Systems, vol. 8, no. 4, pp. 380-396, Apr. 1997.
[6] Y.J. Suh and S. Yalamanchili, "All-to-All Communication with Minimum Start-Up Costs in 2D/3D Tori and Meshes," IEEE Trans. Parallel and Distributed Systems, vol. 9, no. 5, pp. 442-458, May 1998.
[7] Y.C. Tseng, S.Y. Ni, and J.P. Sheu, "Toward Optimal Complete Exchange on Wormhole-Routed Tori," IEEE Trans. Computers, vol. 48, no. 10, pp. 1065-1082, Oct. 1999.
[8] GU Naijie, "Efficient Indirect All-to-All Personalized Communication on Rings and 2-D Tori," Journal of Computer Science and Technology, vol. 16, no. 5, pp. 480-483, Sept. 2001.
[9] N.S. Sundar, D.N. Jayasimha, D.K. Panda, and P. Sadayappan, "Hybrid Algorithms for Complete Exchange in 2D Meshes," IEEE Trans. Parallel and Distributed Systems, vol. 12, no. 12, pp. 1201-1218, Dec. 2001.
[10] Y.J. Suh and K.G. Shin, "All-to-All Personalized Communication in Multidimensional Torus and Mesh Networks," IEEE Trans. Parallel and Distributed Systems, vol. 12, no. 1, pp. 38-59, Jan. 2001.
[11] D.S. Scott, "Efficient All-to-All Communication Patterns in Hypercube and Mesh Topologies," Proc. Sixth Conf. Distributed Memory Concurrent Computers, pp. 398- 403, Portland, OR, April 1991.
[12] S.C. Chau and A.W.C. Fu, "An optical multistage interco- nnection network for optimal all-to-all personalized exchange," Proc. fourth Int-l Conf. Parallel and Distributed Computing, Applications and Technologies (PDCAT'2003), pp. 292-295, 27-29 Aug. 2003.
[13] Y.Y. Yang and J.C. Wang, "Near-Optimal All-to-All Broadcast in Multidimensional All-Port Meshes and Tori," IEEE Trans. Parallel and Distributed Systems, vol. 13, no. 2, pp. 128-141!Feb. 2002.
[14] J.P. Jung and I. Sakho, "A methodology for devising optimal all-port all-to-all broadcast algorithms in 2-dimensional tori," 28th Annual IEEE Int-l Conf. Local Computer Networks (LCN'03), pp. 558-566, Oct. 2003.