      Optimizing Large Language Models: An Experimental Approach to Pruning and Fine-Tuning LLama2 7B

      ML Engineer, Weights & Biases
      To address the high computational demands of large language models (LLMs), we present an experimental approach to model pruning and fine-tuning. Navigate our systematic process for turning a 7-billion-parameter LLama2 model into a practical 1.5-billion-parameter variant. Learn the iterative sequence of layer removal and realignment, backed by extensive experimentation with truncation techniques. We demonstrate how these compact models can serve as efficient drafters, providing rapid responses while their larger counterparts handle more sophisticated tasks.
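      As a rough illustration of the approach described above, the sketch below drops decoder layers from a Llama-style checkpoint using Hugging Face transformers and then uses the truncated model as a drafter for the full model via transformers' assisted generation. The checkpoint name, the choice of which layers to keep, and the fine-tuning placeholder are illustrative assumptions, not the exact recipe presented in the session.

```python
# Minimal sketch, assuming a Llama-style model loaded with Hugging Face
# transformers. Checkpoint names and kept-layer indices are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
draft = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)

# Prune: keep a subset of the 32 decoder layers. Which layers to remove is
# exactly what the talk's truncation experiments explore; keeping the first
# 12 here is an arbitrary placeholder.
keep = list(range(12))
draft.model.layers = torch.nn.ModuleList(draft.model.layers[i] for i in keep)
draft.config.num_hidden_layers = len(keep)

# Realignment fine-tuning of the pruned model on a text corpus (e.g. with
# the transformers Trainer) would follow here before it is usable.

# Draft-and-verify: the compact model proposes tokens that the full model
# accepts or rejects, via transformers' assisted generation.
full = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)
inputs = tokenizer("Pruning large language models", return_tensors="pt")
out = full.generate(**inputs, assistant_model=draft, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

      Because the drafter shares the full model's tokenizer and vocabulary, accepted draft tokens cost only a single verification pass on the large model, which is what makes the compact variant useful for rapid responses.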
      Event: GTC 24
      Date: March 2024
      Level: Advanced Technical
      Industry: All Industries
      NVIDIA Technology: Cloud / Data Center GPU
      Topic: MLOps
      Language: English
      Location: