      LLM Inference Sizing: Benchmarking End-to-End Inference Systems

      , Solutions Architect, NVIDIA
      , Solution Architect, NVIDIA
      Learn how to choose the right path for your AI initiatives by understanding the key metrics in large language model (LLM) inference sizing. This talk will equip you with essential tools to optimize performance by dissecting LLM inference benchmarks and comparing configurations. We'll demonstrate how NVIDIA's software ecosystem can be leveraged to elevate your AI applications by supporting various layers of abstraction for inference. We'll share best practices and tips to help you bring unmatched efficiency and effectiveness to your LLM inference projects.
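The talk centers on key LLM inference metrics. As a rough illustration of what an inference benchmark typically measures, the sketch below computes three commonly used metrics from per-request timing traces: time to first token (TTFT), inter-token latency (ITL), and aggregate token throughput. All names and numbers here are illustrative assumptions, not material from the talk itself.

```python
from dataclasses import dataclass

@dataclass
class RequestTrace:
    """Timing trace for one inference request (all times in seconds)."""
    t_submit: float        # when the request was sent
    t_first_token: float   # when the first output token arrived
    t_last_token: float    # when the final output token arrived
    n_output_tokens: int   # number of tokens generated

def ttft(r: RequestTrace) -> float:
    """Time to first token: queueing plus prefill latency."""
    return r.t_first_token - r.t_submit

def inter_token_latency(r: RequestTrace) -> float:
    """Average gap between successive output tokens (decode speed)."""
    if r.n_output_tokens < 2:
        return 0.0
    return (r.t_last_token - r.t_first_token) / (r.n_output_tokens - 1)

def throughput_tokens_per_s(traces: list[RequestTrace]) -> float:
    """Aggregate output tokens per second over the whole benchmark run."""
    total_tokens = sum(r.n_output_tokens for r in traces)
    start = min(r.t_submit for r in traces)
    end = max(r.t_last_token for r in traces)
    return total_tokens / (end - start)

# Illustrative run: two overlapping requests (made-up numbers).
runs = [
    RequestTrace(t_submit=0.0, t_first_token=0.5, t_last_token=2.5, n_output_tokens=21),
    RequestTrace(t_submit=0.2, t_first_token=0.9, t_last_token=3.0, n_output_tokens=22),
]
print(f"TTFT (req 0): {ttft(runs[0]):.2f} s")                               # 0.50 s
print(f"ITL  (req 0): {inter_token_latency(runs[0]) * 1000:.1f} ms/token")  # 100.0 ms/token
print(f"Throughput:   {throughput_tokens_per_s(runs):.2f} tok/s")           # 14.33 tok/s
```

TTFT dominates perceived responsiveness for interactive use, while throughput drives cost per token; comparing configurations usually means trading one against the other.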
      Event: GTC 24
      Date: March 2024
      Industry: All Industries
      Level: Intermediate Technical
      Topic: Natural Language Processing (NLP)
      NVIDIA Technology: TensorRT, Triton
      Language: English