Accelerating the LLM Life Cycle on the Cloud
, CTO, Lightning AI
, CEO, Lightning AI
Learn practical strategies for developing large language models (LLMs) in the cloud from start to finish. Our session is tailored for machine learning practitioners — a background in cloud operations isn't necessary. Dive into an efficient workflow that spans data preparation, model pre-training, fine-tuning, and model serving via a high-performance API. Our journey begins with compiling a massive 1 trillion-token dataset, setting the stage for pre-training a 1.1 billion-parameter Llama model on a powerful array of 64 NVIDIA A100 GPUs. We'll use hyperparameter search to tune the optimizer prior to the training run. We'll then introduce approaches for alignment and instruction tuning with DPO and QLoRA on A10 GPUs, and the rollout of a scalable, high-performance inference API. We'll deliver a blend of theoretical understanding and practical guidance, complete with deploying a serverless interactive chat application. You'll gain firsthand knowledge of each stage, equipped with resources such as datasets, models, and source code, all within readily available, reproducible cloud-based environments.
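To give a flavor of the alignment step, the DPO (Direct Preference Optimization) objective the session covers can be sketched in plain Python. This is an illustrative formulation under standard assumptions (summed per-token log-probabilities for each response, a frozen reference model, and the usual `beta` KL-control coefficient), not the session's actual training code:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair (chosen vs. rejected response).

    Each argument is the summed log-probability of a response under
    either the trainable policy or the frozen reference model.
    beta controls how far the policy may drift from the reference.
    """
    # Implicit rewards: log-prob ratio of policy vs. reference
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): shrinks as the policy favors the chosen answer
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When policy and reference agree on both responses, the margin is zero and the loss is log 2; as the policy assigns relatively more probability to the chosen response, the loss falls toward zero. In practice this is computed over batches of token-level log-probs with a framework such as PyTorch, with QLoRA keeping the fine-tuned weights small enough for A10-class GPUs.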
Event: GTC 24
Date: March 2024
Industry: All Industries
Level: Technical - Beginner
NVIDIA technologies: Cloud / Data Center GPU, CUDA, cuDNN, NCCL, NVLink / NVSwitch