Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform