Advanced Performance Optimization in CUDA
Developer Technology Engineer, NVIDIA
This talk is the second part of a series on core performance optimization techniques. It is intended for developers who are already familiar with the basics covered in the first part. We'll teach advanced techniques and show how to use some of the new features introduced in Hopper. Topics include asynchronous copies and barriers, CUDA clusters, L2 persistency, CUDA graphs, memory pools, and dynamic parallelism 2.0.