      Scaling and Optimizing Your LLM Pipeline for End-to-End Efficiency

      Product Manager, AI on Google Kubernetes Engine, Google
      Cloud Architect, Google
      Are you having trouble getting large language models (LLMs) to work in your organization? You're not alone. We'll look at how to deploy an open-source LLM on Google Kubernetes Engine (GKE), and show data scientists and machine learning engineers how to use NeMo and TRT LLM from notebooks on GKE. GKE is also uniquely suited to orchestrating AI workloads efficiently and conveniently. We'll demonstrate how to train and tune an LLM using NeMo, and give a live technical demo of how data science teams can run inference on these models on GPUs with TRT LLM and GKE.
      Event: GTC 24
      Date: March 2024
      Topic: AI Inference
      Industry: All Industries
      Level: Technical - Beginner
      NVIDIA Technology: Cloud / Data Center GPU
      Language: English
      Location: