Optimizing Deep Learning Inference using NVIDIA GPUs on AWS Cloud

NVIDIA
Deep learning inference is a compute-intensive workload that directly affects user experience: real-time applications require low latency, while data center efficiency demands high throughput. In this session, we'll demonstrate how developers can use NVIDIA TensorRT to optimize neural network models trained in all major frameworks, and deploy those optimized models in the cloud or at the edge. We'll also walk through code samples that demonstrate the workflow with various frameworks, as well as how to calibrate for lower precision while maintaining high accuracy.
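To give a sense of what "calibrating for lower precision" means, here is a minimal, self-contained sketch of symmetric max-abs INT8 quantization, the basic idea behind calibration. This is an illustration only, not the TensorRT API; the helper names (`int8_scale`, `quantize`, `dequantize`) and the sample calibration data are hypothetical.

```python
# Simplified sketch of symmetric INT8 quantization (illustrative only,
# not TensorRT code). A calibration pass observes activation values,
# derives a scale, and maps floats onto the signed 8-bit range.

def int8_scale(activations):
    """Per-tensor scale from the observed max absolute value (max-abs calibration)."""
    max_abs = max(abs(x) for x in activations)
    return max_abs / 127.0 if max_abs else 1.0

def quantize(x, scale):
    """Round a float to the nearest representable INT8 code, clamped to [-128, 127]."""
    return max(-128, min(127, round(x / scale)))

def dequantize(q, scale):
    """Recover an approximate float from an INT8 code."""
    return q * scale

# Hypothetical calibration data standing in for activations
# collected while running representative inference batches.
calib = [-1.5, -0.3, 0.0, 0.7, 2.54]
scale = int8_scale(calib)
ints = [quantize(x, scale) for x in calib]
recovered = [dequantize(q, scale) for q in ints]
errors = [abs(a - b) for a, b in zip(calib, recovered)]
print(ints)                      # the 8-bit codes actually stored
print(max(errors) <= scale / 2)  # rounding error is bounded by half a step
```

Real calibrators (such as TensorRT's entropy calibrator) choose the range more carefully than a raw max, trading a little clipping for finer resolution, but the quantize/dequantize round trip above is the core mechanism.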
Event: AWS re:Invent
Date: November 2020
Level: Introductory technical
Topic: Deep Learning Inference
Language: English