Optimizing Inference for Neural Machine Translation using Sockeye 2
NVIDIA
Transformer networks have revolutionized the field of machine translation. They've been shown to produce better translations than traditional recurrent neural networks, especially for long input sentences. However, Transformer models keep growing in size, with the latest, GPT-3, weighing in at 175 billion parameters, and training and inference on such large models are computationally intensive. Learn how the NVIDIA A100 GPU is designed to train and deploy such large networks efficiently, and explore a Transformer-based model built with Sockeye, the open-source NMT implementation that powers Amazon Translate. We'll also discuss methods for profiling deep learning workloads with NVIDIA Nsight™ to identify areas for performance improvement, and demonstrate the impact of these optimization techniques with cost-effective inference on an Amazon EC2 G4 instance with NVIDIA T4 GPUs.
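As a minimal sketch of the profiling workflow discussed in the session, a common pattern is to wrap the stages of an inference loop in NVTX ranges so that Nsight Systems can attribute GPU time to each stage on its timeline. The `load_model` and `translate` calls below are illustrative placeholders, not Sockeye's actual API; only the `nvtx` annotations and the `nsys` capture command reflect real tooling.

```python
# Sketch: NVTX-annotated inference loop for Nsight Systems profiling.
# Capture a timeline with:
#   nsys profile --trace=cuda,nvtx -o report python translate.py
import nvtx


def load_model(path):
    # Hypothetical placeholder for loading an NMT model (not Sockeye's API).
    raise NotImplementedError


def run_inference(model, batches):
    for i, batch in enumerate(batches):
        # Each annotate() call appears as a labeled span on the Nsight
        # Systems timeline, making slow batches easy to spot.
        with nvtx.annotate(f"translate_batch_{i}", color="green"):
            model.translate(batch)  # hypothetical decode step
```

Opening the resulting report in the Nsight Systems GUI then shows these labeled ranges alongside CUDA kernel and memory-transfer activity, which is typically where candidate optimizations (larger batches, reduced precision, fewer host-device syncs) become visible.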