LLM Inference Performance and Optimization on NVIDIA GB200 NVL72
Developer Technology Engineer, NVIDIA
In this session, we will dive into the GB200 NVL72 architecture and programming model, highlighting its inference performance on state-of-the-art LLM models.
We will also explore optimization techniques that enable the 72 Blackwell GPUs to work together through NVIDIA NVLink, functioning as one giant GPU.
Event: GTC 25
Date: March 2025
Topic: AI Platforms / Deployment - AI Inference / Inference Microservices