Triton in the Cloud: A Practical Way

, Staff Engineer, Alibaba Cloud
, Senior Engineer, Alibaba Cloud
, Staff Engineer, Alibaba Cloud
Triton Inference Server is a full-featured, extensible, and powerful inference solution for both edge and cloud. Deploying Triton to production in the cloud requires attention to efficiency, scalability, and integration with the infrastructure around the server itself. We'll cover the key insights from providing Triton as a cloud service via EAS on Alibaba Cloud: 1) one-click setup of a Triton cluster; 2) scaling a Triton cluster with incoming request throughput; 3) native integration with OSS (Object Storage Service); and 4) Triton and GPU-sharing scheduling.
Event: GTC Digital Spring
Date: March 2022
Industry: Cloud Services
Topic: Data Center / Cloud Infrastructure - Technical
Level: Intermediate Technical
Language: English