Efficient At-Scale Training and Deployment of Large Language Models

Principal Product Manager, Conversational AI and Deep Learning, NVIDIA
NeMo Megatron enables enterprises to easily train and deploy huge transformer models at scale using several parallelism techniques. In this talk, we will explain how to preprocess data in a multi-node environment, automatically select the best hyperparameters to minimize time-to-train for multiple GPT-3 and T5 configurations, train the model at scale, and deploy it in a multi-node production setting with an easy-to-use set of scripts. NeMo Megatron automates the workflow, shortens the time to deployment, and reduces the total cost of ownership. We will also show how to create prompts automatically to adapt the model to different downstream tasks.
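
The abstract describes a four-stage pipeline: data preprocessing, hyperparameter search, multi-node training, and multi-node deployment. The sketch below is only an illustration of how such stages chain together; every function, class, and parameter name in it is a hypothetical stand-in, not part of NeMo Megatron's actual scripts or API.

"""Hypothetical sketch of the four-stage workflow described above.
None of these names come from NeMo Megatron; each stub stands in for
one scripted stage and returns the artifact the next stage consumes."""

from dataclasses import dataclass


@dataclass
class StageResult:
    """Placeholder for whatever artifact a stage hands to the next one."""
    path: str


def preprocess_corpus(source: str, num_nodes: int) -> StageResult:
    # Stage 1: shard, clean, and tokenize raw text across the cluster.
    return StageResult(path=f"{source}/tokenized")


def search_hyperparameters(model: str, data: StageResult) -> dict:
    # Stage 2: pick parallelism degrees, micro-batch size, etc. to
    # minimize time-to-train for the chosen GPT-3 or T5 configuration.
    return {"model": model, "tensor_parallel": 2, "pipeline_parallel": 1}


def train(config: dict, data: StageResult) -> StageResult:
    # Stage 3: launch multi-node training with the selected configuration.
    return StageResult(path=f"checkpoints/{config['model']}")


def deploy(checkpoint: StageResult, num_nodes: int) -> None:
    # Stage 4: stand up the trained checkpoint for multi-node inference.
    print(f"serving {checkpoint.path} on {num_nodes} node(s)")


if __name__ == "__main__":
    data = preprocess_corpus(source="raw_text", num_nodes=4)
    cfg = search_hyperparameters(model="gpt3_5b", data=data)
    ckpt = train(cfg, data)
    deploy(ckpt, num_nodes=2)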
Event: GTC Digital September
Date: September 2022
Industry: All Industries
Level: Beginner Technical
Topic: Conversational AI / NLP
Language: English
Topic: Deep Learning - Frameworks
Location: