CUDA Techniques to Maximize Memory Bandwidth and Hide Latency
, Sr. Developer Technology, NVIDIA
, Developer Technology Engineer, NVIDIA
Highly Rated
Do you want to write speed-of-light CUDA kernels? When a hand-written kernel is the right approach for the job, you want to use the best possible practices to maximize performance on the GPU. In this tutorial, we will showcase techniques to maximize memory bandwidth and hide memory latencies in our CUDA kernels, including how to efficiently use shared memory, distributed shared memory and asynchronous data copies.
Event: GTC 25
Date: March 2025
Industry: All Industries
NVIDIA Technology: CUDA, CUDA-X, Nsight Compute, Nsight Systems
Topic: Development and Optimization - Performance Optimization