Profilers, Python, and Performance: Nsight Tools for Optimizing Modern CUDA Workloads
Sr. System Software Engineer, NVIDIA
Senior Systems Software Engineer, NVIDIA
Learn how NVIDIA's profiling tools, Nsight Systems and Nsight Compute, can help you accelerate modern compute workloads. In the first half of this lab, you'll get hands-on experience optimizing CUDA applications with Nsight Systems, a system-wide performance analysis tool that helps you optimize and scale a CUDA application wherever it runs: on a simple workstation, an embedded device, or a cluster in the cloud. In the second half, we'll switch focus to individual workloads on the GPU and explore how to use Nsight Compute to dive deep into CUDA details for applications written in Python, the language of AI. You'll learn how to measure and optimize hardware pipeline utilization, memory accesses, and more, from the source to the assembly level.
Prerequisite(s):
Attendees should have basic knowledge of Python and CUDA programming. Experience with profiling tools and/or computer vision frameworks is a plus, but is not required to follow the course.