Efficient At-Scale Training and Deployment of Large Language Models

Principal Product Manager, Conversational AI and Deep Learning, NVIDIA
NeMo Megatron enables enterprises to easily train and deploy huge transformer models at scale using several parallelism techniques. In this talk, we will explain how to preprocess data in a multi-node environment, automatically select the best hyperparameters to minimize time-to-train for multiple GPT-3 and T5 configurations, train the model at scale, and deploy it in a multi-node production setting with an easy-to-use set of scripts. NeMo Megatron automates the workflow, shortens the time to deployment, and reduces the total cost of ownership. We will also show how to create prompts automatically to adapt the model to different downstream tasks.
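
The abstract describes a four-stage pipeline: data preprocessing, hyperparameter search, multi-node training, and multi-node deployment. The sketch below is only an illustration of how such stages chain together; every function, class, and parameter name in it is a hypothetical stand-in, not part of NeMo Megatron's actual scripts or API.

"""Hypothetical sketch of the four-stage workflow described above.
None of these names come from NeMo Megatron; each stub stands in for
one scripted stage and returns the artifact the next stage consumes."""

from dataclasses import dataclass


@dataclass
class StageResult:
    """Placeholder for whatever artifact a stage hands to the next one."""
    path: str


def preprocess_corpus(source: str, num_nodes: int) -> StageResult:
    # Stage 1: shard, clean, and tokenize raw text across the cluster.
    return StageResult(path=f"{source}/tokenized")


def search_hyperparameters(model: str, data: StageResult) -> dict:
    # Stage 2: pick parallelism degrees, micro-batch size, etc. to
    # minimize time-to-train for the chosen GPT-3 or T5 configuration.
    return {"model": model, "tensor_parallel": 2, "pipeline_parallel": 1}


def train(config: dict, data: StageResult) -> StageResult:
    # Stage 3: launch multi-node training with the selected configuration.
    return StageResult(path=f"checkpoints/{config['model']}")


def deploy(checkpoint: StageResult, num_nodes: int) -> None:
    # Stage 4: stand up the trained checkpoint for multi-node inference.
    print(f"serving {checkpoint.path} on {num_nodes} node(s)")


if __name__ == "__main__":
    data = preprocess_corpus(source="raw_text", num_nodes=4)
    cfg = search_hyperparameters(model="gpt3_5b", data=data)
    ckpt = train(cfg, data)
    deploy(ckpt, num_nodes=2)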
Event: GTC Digital September
Date: September 2022
Industry: All Industries
Level: Beginner Technical
Topic: Conversational AI / NLP
Language: English
Topic: Deep Learning - Frameworks
Location: