Name: Simplify and Scale Model Serving with NVIDIA Triton Inference Server on Google Cloud Vertex AI Prediction (Presented by Google) S42534 | GTC Digital Spring 2022 | NVIDIA On-Demand
Uploaded: 2022-03-23T10:00:00Z
Duration: 2128 s
Description: NVIDIA Triton Inference Server (Triton) is open-source inference-serving software that maximizes performance and simplifies model deployment at scale.

NVIDIA Triton Inference Server (Triton) is open-source inference-serving software that maximizes performance and simplifies model deployment at scale. Triton supports multiple frameworks (TensorRT, TensorFlow, ONNX, PyTorch, and more), along with custom CUDA and Python backends, on GPU- and CPU-based infrastructure in the cloud, in the data center, and at the edge. Google Cloud and NVIDIA collaborated to add Triton as a backend on Vertex AI Prediction, Google Cloud's fully managed model serving platform.

Event: GTC Digital Spring

Date: March 2022

Industry: All Industries

Topic: Deep Learning - Inference

Level: Intermediate Technical

Language: English

Location: