NVIDIA Triton Inference Server on AWS: Customer success stories and AWS deployment methods to optimize inference throughput, reduce latency, and lower GPU or CPU inference costs.

Speakers from Amazon and NVIDIA
NVIDIA Triton Inference Server simplifies the deployment of AI models at scale in production. This open-source inference-serving software lets teams deploy trained AI models from any framework (TensorFlow, PyTorch, ONNX Runtime, TensorRT, or a custom backend) on Amazon SageMaker, Amazon ECS, and Amazon EKS, on GPUs or CPUs. Customers can now benefit from Triton's performance optimizations, dynamic batching, and multi-framework support on AWS. Learn how customers use Triton to improve their inference performance. We'll discuss how to deploy NVIDIA Triton on AWS, including Amazon SageMaker, EKS, and ECS, for GPU-based inference, and we'll share getting-started resources. The integration of NVIDIA Triton Inference Server with Amazon SageMaker is available in all AWS Regions where Amazon SageMaker is available.
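The dynamic batching mentioned above is configured per model in Triton's model repository. A minimal sketch of a repository layout and a config.pbtxt that enables dynamic batching follows; the model name, backend, batch sizes, and queue delay are illustrative values, not taken from the talk:

```
model_repository/
└── resnet/
    ├── config.pbtxt
    └── 1/
        └── model.plan   # TensorRT engine; other backends use model.onnx, model.pt, etc.
```

```
# config.pbtxt -- illustrative example
name: "resnet"
platform: "tensorrt_plan"
max_batch_size: 32
dynamic_batching {
  preferred_batch_size: [ 8, 16 ]
  max_queue_delay_microseconds: 100
}
```

On Amazon SageMaker, a Triton endpoint can be created with the standard boto3 SageMaker calls by pointing the model at the sagemaker-tritonserver container image and at a tarball of the model repository in S3. The sketch below is illustrative, not the speakers' exact setup: the bucket, resource names, IAM role ARN, and image URI (which varies by region and release; check the AWS documentation) are placeholder assumptions.

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

# SageMaker Triton container image (ECR account ID and tag vary by
# region and release; the URI below is a placeholder example).
triton_image = "785573368785.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tritonserver:21.08-py3"

# Register the model: container image + model repository tarball in S3.
sm.create_model(
    ModelName="triton-resnet",
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    PrimaryContainer={
        "Image": triton_image,
        "ModelDataUrl": "s3://my-bucket/triton/model.tar.gz",  # placeholder
        # Tells the container which model in the repository to serve.
        "Environment": {"SAGEMAKER_TRITON_DEFAULT_MODEL_NAME": "resnet"},
    },
)

# Endpoint configuration on a single-GPU instance.
sm.create_endpoint_config(
    EndpointConfigName="triton-resnet-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "triton-resnet",
        "InstanceType": "ml.g4dn.xlarge",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,
    }],
)

# Create the real-time endpoint.
sm.create_endpoint(
    EndpointName="triton-resnet-endpoint",
    EndpointConfigName="triton-resnet-config",
)
```

Once the endpoint is in service, it accepts inference requests in Triton's KServe-v2-style request format via the SageMaker runtime's invoke_endpoint API.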
Event: GTC Digital November
Date: November 2021
Industry: All Industries
Topic: Deep Learning - Inference
Level: Intermediate Technical
Language: English
Topic: Accelerated Computing & Dev Tools - Performance Optimization