      Fast GPU Inference with TensorRT on Amazon SageMaker

      , NVIDIA
      Deep neural network (DNN) model complexity has grown dramatically over the last decade, from AlexNet with 61 million parameters to GPT-3 with 175 billion. Running real-time inference on such large models is difficult without a graph "compression" technology that optimizes a pre-trained neural network graph. NVIDIA TensorRT™-optimized DNN models improve inference speed by 2-3x on a GPU compared to the original models. In this session, you'll learn about NVIDIA TensorRT Lite, a developer-friendly path to the TensorRT inference library that lets you optimize pre-trained DNNs.
      Event: AWS re:Invent
      Date: December 2020
      Level: Introductory technical
      Industry: Cloud Services
      Topic: Deep Learning Inference
      Languages: Chinese (Simplified), English, Japanese, Korean, Chinese (Traditional)
      Location: