      NCCL: High-Speed Inter-GPU Communication for Large-Scale Training

      , NVIDIA
      Highly Rated
      Inter-GPU communication is central to training deep learning networks on multiple GPUs. The NCCL library fills that role in most frameworks and is used by PyTorch, Horovod, and others. Learn how hardware choices can directly impact performance at scale, and what performance to expect from various platforms, including DGX systems. Understand why NVLink is critical to large-scale computing and how it's combined with InfiniBand/RoCE to deliver orders-of-magnitude higher performance than standard off-the-shelf systems. Also discover the new communication patterns DL training uses and how node and fabric topology can impact them. Finally, learn how the NCCL API has evolved to serve new needs in parallel computing, including HPC workloads.
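
      The all-reduce collective the abstract refers to is what frameworks like PyTorch and Horovod issue through NCCL's C API. Below is a minimal sketch, not taken from the session, of a single process driving two GPUs with ncclAllReduce; the device count and buffer size are illustrative assumptions, and error checking is omitted for brevity.

```c
// Minimal single-process, multi-GPU all-reduce with NCCL.
// Assumptions (not from the session): 2 GPUs, 1M floats per GPU.
#include <cuda_runtime.h>
#include <nccl.h>

#define NDEV 2
#define COUNT (1 << 20)  /* elements per GPU */

int main(void) {
  int devs[NDEV] = {0, 1};
  ncclComm_t comms[NDEV];
  float *sendbuf[NDEV], *recvbuf[NDEV];
  cudaStream_t streams[NDEV];

  /* One communicator per GPU, all created by this single process. */
  ncclCommInitAll(comms, NDEV, devs);

  for (int i = 0; i < NDEV; ++i) {
    cudaSetDevice(devs[i]);
    cudaMalloc((void **)&sendbuf[i], COUNT * sizeof(float));
    cudaMalloc((void **)&recvbuf[i], COUNT * sizeof(float));
    cudaStreamCreate(&streams[i]);
  }

  /* Group the per-GPU calls so NCCL launches them as one collective. */
  ncclGroupStart();
  for (int i = 0; i < NDEV; ++i)
    ncclAllReduce(sendbuf[i], recvbuf[i], COUNT, ncclFloat, ncclSum,
                  comms[i], streams[i]);
  ncclGroupEnd();

  for (int i = 0; i < NDEV; ++i) {
    cudaSetDevice(devs[i]);
    cudaStreamSynchronize(streams[i]);
    cudaFree(sendbuf[i]);
    cudaFree(recvbuf[i]);
    ncclCommDestroy(comms[i]);
  }
  return 0;
}
```

      The grouped calls matter in practice: NCCL picks the transport (NVLink within a node, InfiniBand/RoCE across nodes) based on the topology it detects, which is why the hardware choices discussed in the talk show up directly in collective bandwidth.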
      Event: GTC Digital April
      Date: April 2021
      Level: Advanced Technical
      Topic: Deep Learning Training
      Industry: Supercomputing
      Language: English