Deploy and Scale AI Deep Learning Models Easily with Triton Inference Server

NVIDIA
Triton Inference Server is model-serving software that simplifies the deployment of AI models at scale in production. It allows teams to deploy trained AI models from any framework (TensorFlow, NVIDIA TensorRT, PyTorch, ONNX Runtime, or a custom framework) on any GPU- or CPU-based infrastructure (cloud, data center, or edge). Learn about high-performance inference serving with Triton's concurrent model execution and dynamic batching features, and about deploying it in different environments, through integrations, using Kubernetes/EKS and other tools.
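As a concrete illustration of the serving workflow the session covers, here is a minimal sketch of an inference request sent to a running Triton server using the official Python HTTP client (`tritonclient`, installable with `pip install tritonclient[http]`). The model name `resnet50` and its tensor names and shapes are placeholders for illustration; concurrent execution and dynamic batching are enabled server-side in the model's `config.pbtxt` (via the `instance_group` and `dynamic_batching` settings), not in client code.

```python
# Minimal sketch of a Triton inference request, assuming a server at
# localhost:8000 serving a hypothetical model named "resnet50" whose
# input/output tensor names and shapes are placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a single request; with dynamic batching enabled in the model's
# config.pbtxt, Triton combines many such requests into batches server-side.
image = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy input
inputs = [httpclient.InferInput("input__0", list(image.shape), "FP32")]
inputs[0].set_data_from_numpy(image)
outputs = [httpclient.InferRequestedOutput("output__0")]

result = client.infer(model_name="resnet50", inputs=inputs, outputs=outputs)
print(result.as_numpy("output__0").shape)  # e.g. (1, 1000) for a classifier
```

The same request pattern works against a Triton deployment on a single machine or behind a Kubernetes/EKS service, since clients only see the HTTP (or gRPC) endpoint.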
Event: AWS re:Invent
Date: December 2020
Level: Advanced Technical
Industry: Cloud Services
Topic: Deep Learning Inference
Language: Chinese (Simplified), English, Japanese, Korean, Chinese (Traditional)
Location: