Parallel programming is moving computation from single-core to multicore architectures. Graphics processing units (GPUs) have recently emerged as outstanding platforms for data-parallel applications with regular data access patterns. However, it remains challenging to optimize computations with irregular data access patterns, such as sparse matrix-vector multiplication (SpMV). SpMV is one of the most important computational kernels in engineering practice and scientific computation. Various data formats for storing the sparse matrix have been implemented on GPUs to maximize performance. In this paper, we propose and evaluate a new implementation of SpMV on the GPU based on the QCSR storage format, which combines the quadtree storage format with the CSR format. We also outline several optimization strategies to improve performance. Compared with a previously published implementation based on the BCSR format, it achieves higher overall performance, with an average speedup of 1.15 over BCSR.
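For readers unfamiliar with the CSR format mentioned above, the following is a minimal reference sketch of SpMV over a plain CSR matrix, written in sequential Python. It is not the QCSR scheme proposed in the paper and not a GPU kernel; the function name `csr_spmv`, the array names `row_ptr`, `col_idx`, and `values`, and the 3x3 example matrix are assumptions chosen for illustration. The indirect read `x[col_idx[k]]` is the irregular access pattern that makes SpMV hard to optimize on GPUs.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    (Illustrative names, not taken from the paper.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect, data-dependent read of x: the irregular access.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)
```

On a GPU, the outer loop over rows is the part typically distributed across threads; blocked formats such as BCSR group nonzeros into small dense blocks so that reads of `x` are shared within a block.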
Zhang, Jilin; Liu, Enyi; Wan, Jian; Ren, Yongjian; Yue, Miao; and Wang, Jue
"Implementing Sparse Matrix-Vector Multiplication with QCSR on GPU,"
Applied Mathematics & Information Sciences: Vol. 07, Iss. 2, Article 6.
Available at: https://dc.naturalspublishing.com/amis/vol07/iss2/6