Kernel Optimization for AI and Beyond: Unlocking the Power of Nsight Compute
, Sr. System Software Engineer, NVIDIA
, Senior System Software Engineer, NVIDIA
Learn how to unlock the full potential of NVIDIA GPUs with the profiling and analysis capabilities of Nsight Compute. AI workloads are rapidly increasing the demand for GPU computing, so it is essential that they use all available GPU resources efficiently. Nsight Compute is the most powerful tool for understanding kernel execution behavior and performance. Learn how to configure and launch profiles customized for your needs, with advice on profiling accelerated Python applications and AI frameworks like PyTorch, and on optimizing the Tensor Core utilization that is essential to modern AI performance. Learn how to debug your kernel and how to use "Guided Analysis," the expert system built into Nsight Compute, which automatically detects common issues and directs you to the most relevant performance data, all the way down to the source code level.
Prerequisite(s): Familiarity with CUDA.
Event: GTC 25
Date: March 2025
Industry: All Industries
NVIDIA Technology: CUDA, Nsight Compute
Topic: Development and Optimization - Profilers / Debuggers / Code Analysis