      LLM Inference Performance and Optimization on NVIDIA GB200 NVL72

      , Developer Technology Engineer, NVIDIA
      In this session, we will dive into the GB200 NVL72 architecture and programming model, highlighting its inference performance on state-of-the-art LLM models.

We will also explore optimization techniques that enable the 72 Blackwell GPUs to work together over NVIDIA NVLink, functioning as one giant GPU.
Event: GTC 25
Date: March 2025
Topic: AI Platforms / Deployment - AI Inference / Inference Microservices
Industry: All Industries
Level: General
NVIDIA Technologies: Grace CPU, TensorRT, Hopper, NVLink / NVSwitch, Blackwell
Language: English
Location: