    LLM Inference Performance and Optimization on NVIDIA GB200 NVL72

    , Developer Technology Engineer, NVIDIA
    In this session, we will dive into the GB200 NVL72 architecture and programming model, highlighting its inference performance on state-of-the-art LLMs.

    We will also explore optimization techniques that enable the 72 Blackwell GPUs to work together through NVIDIA NVLink, functioning as one giant GPU.
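    The "one giant GPU" idea rests on tensor parallelism: a layer's weight matrix is sharded across all 72 GPUs, each computes a partial result, and NVLink gathers the shards. Below is a minimal sketch of that column-parallel split, simulated on CPU with NumPy; the 72-way split mirrors the NVL72 domain size, but real deployments use device placement and NCCL collectives, which are omitted here. All dimensions and names are illustrative assumptions, not part of the session material.

    ```python
    import numpy as np

    # Column-parallel tensor parallelism, simulated on CPU.
    # Each of the 72 "GPUs" holds a column shard of W, computes a partial
    # matmul, and the shards are concatenated (an all-gather over NVLink
    # in a real NVL72 deployment).
    NUM_GPUS = 72
    d_model, d_ff = 128, NUM_GPUS * 64   # d_ff must divide evenly by NUM_GPUS

    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, d_model))      # a batch of activations
    W = rng.standard_normal((d_model, d_ff))   # the full weight matrix

    shards = np.split(W, NUM_GPUS, axis=1)     # one column shard per rank
    partials = [x @ w for w in shards]         # per-GPU partial matmuls
    y = np.concatenate(partials, axis=1)       # "all-gather" of the shards

    # The sharded computation matches the single-device matmul.
    assert np.allclose(y, x @ W)
    ```

    Column-parallel splits need only a gather at the end; the complementary row-parallel split instead ends in a reduction, and production stacks alternate the two to minimize communication.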
    Event: GTC 25
    Date: March 2025
    Topic: AI Platforms / Deployment - AI Inference / Inference Microservices
    Industry: All Industries
    Level: General
    NVIDIA Technologies: Grace CPU, TensorRT, Hopper, NVLink / NVSwitch, Blackwell
    Language: English
    Location: