    LLM Inference Performance and Optimization on NVIDIA GB200 NVL72

    , Developer Technology Engineer, NVIDIA
    In this session, we will dive into the GB200 NVL72 architecture and programming model, highlighting its inference performance on state-of-the-art LLMs.

    We will also explore optimization techniques that enable the 72 Blackwell GPUs to work together through NVIDIA NVLink, functioning as one giant GPU.
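    The "one giant GPU" idea rests on tensor parallelism: a layer's weight matrix is sharded across all 72 GPUs, each computes a partial result, and NVLink gathers the shards. Below is a minimal sketch of that column-parallel split, simulated on CPU with NumPy; the 72-way split mirrors the NVL72 domain size, but real deployments use device placement and NCCL collectives, which are omitted here. All dimensions and names are illustrative assumptions, not part of the session material.

    ```python
    import numpy as np

    # Column-parallel tensor parallelism, simulated on CPU.
    # Each of the 72 "GPUs" holds a column shard of W, computes a partial
    # matmul, and the shards are concatenated (an all-gather over NVLink
    # in a real NVL72 deployment).
    NUM_GPUS = 72
    d_model, d_ff = 128, NUM_GPUS * 64   # d_ff must divide evenly by NUM_GPUS

    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, d_model))      # a batch of activations
    W = rng.standard_normal((d_model, d_ff))   # the full weight matrix

    shards = np.split(W, NUM_GPUS, axis=1)     # one column shard per rank
    partials = [x @ w for w in shards]         # per-GPU partial matmuls
    y = np.concatenate(partials, axis=1)       # "all-gather" of the shards

    # The sharded computation matches the single-device matmul.
    assert np.allclose(y, x @ W)
    ```

    Column-parallel splits need only a gather at the end; the complementary row-parallel split instead ends in a reduction, and production stacks alternate the two to minimize communication.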
    Event: GTC 25
    Date: March 2025
    Topic: AI Platforms / Deployment - AI Inference / Inference Microservices
    Industry: All Industries
    Level: General
    NVIDIA Technologies: Grace CPU, TensorRT, Hopper, NVLink / NVSwitch, Blackwell
    Language: English
    Location: