Fast, Scalable, and Standardized AI Inference Deployment for Multiple Frameworks, Diverse Models on CPUs and GPUs with Open-source NVIDIA Triton
, Product Manager, NVIDIA
, Product Marketing Manager, NVIDIA
We'll go over NVIDIA Triton Inference Server software and what's new. Triton is open-source inference-serving software for fast and scalable AI in applications. Learn how Triton helps deploy models from all popular frameworks — including TensorFlow, PyTorch, ONNX, TensorRT, RAPIDS FIL (for XGBoost, Scikit-learn Random Forest, LightGBM), OpenVINO, Python, and even custom C++ backends. Also learn about the features that help optimize inference for multiple query types — real-time, batch, streaming, and model ensembles. We'll cover how to deploy standardized inference in production on both NVIDIA GPUs and x86 and ARM CPUs in the cloud or data center, at the enterprise edge, and even on embedded devices like the NVIDIA Jetson, as well as how to use Triton in virtualized environments (e.g., VMware vSphere), Kubernetes, and machine-learning platforms like Amazon SageMaker, Azure ML, and Google Vertex AI.
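As a minimal sketch of what a real-time query against a deployed model can look like (the model name "resnet50" and the tensor names "input__0" and "output__0" are hypothetical placeholders, not from this session), a Python client using the open-source tritonclient package might send a single HTTP inference request like this:

import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server on its default HTTP endpoint.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request; names, shapes, and dtypes must match the model's config.pbtxt.
image = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy input batch
infer_input = httpclient.InferInput("input__0", image.shape, "FP32")
infer_input.set_data_from_numpy(image)
infer_output = httpclient.InferRequestedOutput("output__0")

# Run inference and read the result back as a NumPy array.
response = client.infer(model_name="resnet50", inputs=[infer_input], outputs=[infer_output])
print(response.as_numpy("output__0").shape)

The same request could equally be sent over Triton's gRPC endpoint (tritonclient.grpc) or batched with other clients' requests by the server's dynamic batcher; those settings live in the model's configuration, not in the client code.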