    CUDA Techniques to Maximize Memory Bandwidth and Hide Latency

    , Sr. Developer Technology, NVIDIA
    , Developer Technology Engineer, NVIDIA
    Highly rated
    Do you want to write speed-of-light CUDA kernels? When a hand-written kernel is the right approach for the job, you want to use the best possible practices to maximize performance on the GPU. In this tutorial, we will showcase techniques to maximize memory bandwidth and hide memory latencies in our CUDA kernels, including how to efficiently use shared memory, distributed shared memory and asynchronous data copies.

    Event: GTC 25
    Date: March 2025
    Industry: All Industries
    NVIDIA Technology: CUDA, CUDA-X, Nsight Compute, Nsight Systems
    Topic: Development and Optimization - Performance Optimization
    Level: Technical - Advanced
    Language: English
    Location: