Simplify and Scale Model Serving with NVIDIA Triton Inference Server on Google Cloud Vertex AI Prediction (Presented by Google)

Solutions Architecture, NVIDIA
Solutions Architect, Machine Learning, Google Cloud
Software Engineer, Vertex AI Prediction, Google Cloud
NVIDIA Triton Inference Server (Triton) is open-source inference serving software that maximizes performance and simplifies model deployment at scale. Triton supports multiple frameworks (TensorRT, TensorFlow, ONNX, PyTorch, and more), along with custom CUDA and Python backends, on GPU- and CPU-based infrastructure in the cloud, in the data center, and at the edge. Google Cloud and NVIDIA collaborated to add Triton as a backend on Vertex AI Prediction, Google Cloud's fully managed model serving platform.
Event: GTC Digital Spring
Date: March 2022
Industry: All Industries
Topic: Deep Learning - Inference
Level: Intermediate Technical
Language: English
Location: