%0 Journal Article
	%A Weizhi Xu and  Zhiyong Liu and  Dongrui Fan and  Shuai Jiao and  Xiaochun Ye and  Fenglong Song and  Chenggang Yan
	%D 2012
	%J International Journal of Computer and Information Engineering
	%B World Academy of Science, Engineering and Technology
	%I Open Science Index 61, 2012
	%T Accelerating Sparse Matrix Vector Multiplication on Many-Core GPUs
	%U https://publications.waset.org/pdf/2362
	%V 61
	%X Many-core GPUs provide high computing power and
substantial bandwidth; however, optimizing irregular applications
such as SpMV on GPUs remains a difficult but worthwhile task. In this
paper, we propose a novel method to improve the performance of
SpMV on GPUs. A new storage format called HYB-R is proposed to
exploit GPU architecture more efficiently. The COO portion of the
matrix is partitioned recursively into an ELL portion and a COO
portion in the process of creating HYB-R format to ensure that there
are as many non-zeros as possible in ELL format. Because the way the
matrix is partitioned is important for the HYB-R kernel, we also tune
the partitioning parameters for higher performance. Experimental
results show that our method outperforms the fastest kernel (HYB) in
NVIDIA's SpMV library, with a speedup of up to 17%.
	%P 11 - 18