      LLM Inference Sizing: Benchmarking End-to-End Inference Systems

      , Solutions Architect, NVIDIA
      , Solution Architect, NVIDIA
      Learn how to choose the right path for your AI initiatives by understanding the key metrics in large language model (LLM) inference sizing. This talk will equip you with essential tools to optimize performance by dissecting LLM inference benchmarks and comparing configurations. We'll demonstrate how NVIDIA's software ecosystem can be leveraged to elevate your AI applications by supporting various layers of abstraction for inference. We'll share best practices and tips to help you bring unmatched efficiency and effectiveness to your LLM inference projects.
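The talk centers on key LLM inference metrics. As a rough illustration of what an inference benchmark typically measures, the sketch below computes three commonly used metrics from per-request timing traces: time to first token (TTFT), inter-token latency (ITL), and aggregate token throughput. All names and numbers here are illustrative assumptions, not material from the talk itself.

```python
from dataclasses import dataclass

@dataclass
class RequestTrace:
    """Timing trace for one inference request (all times in seconds)."""
    t_submit: float        # when the request was sent
    t_first_token: float   # when the first output token arrived
    t_last_token: float    # when the final output token arrived
    n_output_tokens: int   # number of tokens generated

def ttft(r: RequestTrace) -> float:
    """Time to first token: queueing plus prefill latency."""
    return r.t_first_token - r.t_submit

def inter_token_latency(r: RequestTrace) -> float:
    """Average gap between successive output tokens (decode speed)."""
    if r.n_output_tokens < 2:
        return 0.0
    return (r.t_last_token - r.t_first_token) / (r.n_output_tokens - 1)

def throughput_tokens_per_s(traces: list[RequestTrace]) -> float:
    """Aggregate output tokens per second over the whole benchmark run."""
    total_tokens = sum(r.n_output_tokens for r in traces)
    start = min(r.t_submit for r in traces)
    end = max(r.t_last_token for r in traces)
    return total_tokens / (end - start)

# Illustrative run: two overlapping requests (made-up numbers).
runs = [
    RequestTrace(t_submit=0.0, t_first_token=0.5, t_last_token=2.5, n_output_tokens=21),
    RequestTrace(t_submit=0.2, t_first_token=0.9, t_last_token=3.0, n_output_tokens=22),
]
print(f"TTFT (req 0): {ttft(runs[0]):.2f} s")                               # 0.50 s
print(f"ITL  (req 0): {inter_token_latency(runs[0]) * 1000:.1f} ms/token")  # 100.0 ms/token
print(f"Throughput:   {throughput_tokens_per_s(runs):.2f} tok/s")           # 14.33 tok/s
```

TTFT dominates perceived responsiveness for interactive use, while throughput drives cost per token; comparing configurations usually means trading one against the other.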
      Event: GTC 24
      Date: March 2024
      Industry: All Industries
      Level: Intermediate Technical
      Topic: Natural Language Processing (NLP)
      NVIDIA Technology: TensorRT, Triton
      Language: English