Fast GPU Inference with TensorRT on Amazon SageMaker
NVIDIA
Deep neural network (DNN) model complexity has grown dramatically over the last decade, from AlexNet with 61 million parameters to GPT-3 with 175 billion. Running real-time inference on models of this scale is difficult without an optimization technology that compresses and restructures the pre-trained network graph. NVIDIA TensorRT™-optimized DNN models deliver 2-3X faster inference on a GPU compared to the original models. In this session, you'll learn about NVIDIA TensorRT Lite, a developer-friendly path to the TensorRT inference library that lets you optimize pre-trained DNNs.
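As a rough illustration of the kind of workflow the session covers, here is a minimal sketch of building a TensorRT-optimized engine from a pre-trained model using the standard TensorRT 8.x Python API and its ONNX parser (not the TensorRT Lite path demonstrated in the session); the file names "model.onnx" and "model.plan" are placeholders:

```python
import tensorrt as trt

# Sketch only, assuming TensorRT 8.x: parse a pre-trained ONNX model
# and build an optimized inference engine from it.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# "model.onnx" is a placeholder for your pre-trained model file.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # allow reduced precision where supported

# Build and serialize the optimized engine for later deployment.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)
```

The serialized engine can then be loaded by a TensorRT runtime on the target GPU (for example, inside a SageMaker inference container) to serve low-latency predictions.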