      CUDA Techniques to Maximize Memory Bandwidth and Hide Latency

      , Sr. Developer Technology, NVIDIA
      , Developer Technology Engineer, NVIDIA
      Highly Rated
      Do you want to write speed-of-light CUDA kernels? When a hand-written kernel is the right approach for the job, you want to use the best possible practices to maximize performance on the GPU. In this tutorial, we will showcase techniques to maximize memory bandwidth and hide memory latencies in our CUDA kernels, including how to efficiently use shared memory, distributed shared memory and asynchronous data copies.

      Event: GTC 25
      Date: March 2025
      Industry: All Industries
      NVIDIA Technology: CUDA, CUDA-X, Nsight Compute, Nsight Systems
      Topic: Development and Optimization - Performance Optimization
      Level: Technical - Advanced
      Language: English
      Location: