Posts by Collection

portfolio

publications

AMF-CSR: Adaptive Multi-Row Folding of CSR for SpMV on GPU

Published in 2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS), 2021

This paper proposes AMF-CSR, a new GPU-based SpMV algorithm.

Recommended citation: Jianhua Gao, Weixing Ji, Senhao Shao, Yizhuo Wang, Feng Shi. (2021). "AMF-CSR: Adaptive Multi-Row Folding of CSR for SpMV on GPU." 2021 IEEE 27th International Conference on Parallel and Distributed Systems. 418-425. https://ieeexplore.ieee.org/document/9763779

Towards Optimal Fast Matrix Multiplication on CPU-GPU Platforms

Published in International Conference on Parallel and Distributed Computing: Applications and Technologies (PDCAT), 2021

This paper proposes a CPU-GPU heterogeneous implementation of the Winograd algorithm.

Recommended citation: Senhao Shao, Yizhuo Wang, Weixing Ji, Jianhua Gao. (2022). "Towards Optimal Fast Matrix Multiplication on CPU-GPU Platforms." International Conference on Parallel and Distributed Computing: Applications and Technologies (PDCAT). 223–236. https://link.springer.com/chapter/10.1007/978-3-030-96772-7_21

TaiChi: A Hybrid Compression Format for Binary Sparse Matrix-Vector Multiplication on GPU

Published in IEEE Transactions on Parallel and Distributed Systems, 2022

This paper proposes a new compression format for binary sparse matrices.

Recommended citation: Jianhua Gao, Weixing Ji, Zhaonian Tan, Yizhuo Wang, Feng Shi. (2022). "TaiChi: A Hybrid Compression Format for Binary Sparse Matrix-Vector Multiplication on GPU." IEEE Transactions on Parallel and Distributed Systems. 33(12):3732-3745. https://ieeexplore.ieee.org/document/9763312

talks

Optimizing Sparse Matrix-Vector Multiplication Based on Data Distribution Characteristics

Published:

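This talk mainly covers two published works and one ongoing work from our research group. The two published works are "TaiChi: A Hybrid Compression Format for Binary Sparse Matrix-Vector Multiplication on GPU" and "AMF-CSR: Adaptive Multi-Row Folding of CSR for SpMV on GPU". The first was published in April 2022 in TPDS, a leading international journal, and the second was accepted by ICPADS 2021, a leading international conference. The third, "Revisiting Thread Configuration of SpMV Kernels on GPU: A Machine Learning Based Approach", is the group's work in progress. The talk focuses on improving the computational efficiency of SpMV on GPUs and proposes solutions to three problems: high data transfer overhead, low cache hit rates combined with load imbalance, and the limitations of a single thread configuration scheme.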

teaching

Parallel Programming: Principle and Practice

Undergraduate course, Beijing Normal University, School of Artificial Intelligence, 2023

This course provides students with a comprehensive understanding of the principles, techniques, and best practices involved in developing parallel programs. Parallel programming is a fundamental aspect of high-performance computing and plays a crucial role in harnessing the power of modern computer architectures.