
    Taking AI Models to Production: Accelerated Inference with Triton Inference Server

    Product Marketing Manager, NVIDIA
    Machine learning and deep learning inference is expected to grow faster than training. But the complexity that teams must manage to deploy, run, and scale models in production is enormous: multiple frameworks, evolving model architectures, high query volumes, diverse computing platforms, and cloud-to-edge AI. There's a need to standardize and streamline inference without sacrificing model performance. We'll look at recent additions to the open-source inference serving software, Triton Inference Server, in three broad categories: support for new frameworks and workloads, inference workflow optimization tools, and scaling. We'll also cover integrations with other deployment platforms and tools, and look at how some customers are using Triton to manage this complexity and achieve the key performance indicators they've set for their businesses.
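
    For context, here is a minimal sketch (not taken from the session itself) of what a client-side inference request to a running Triton Inference Server can look like, using the tritonclient Python package over Triton's default HTTP port 8000. The model name "my_model" and the tensor names "INPUT0"/"OUTPUT0" are hypothetical placeholders, assuming a model with a single FP32 input and a single output.

        import numpy as np
        import tritonclient.http as httpclient

        # Connect to a Triton server on its default HTTP endpoint.
        client = httpclient.InferenceServerClient(url="localhost:8000")

        # Describe the input tensor. "INPUT0" and its shape are placeholders
        # that must match the model's config.pbtxt in the model repository.
        batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
        infer_input = httpclient.InferInput("INPUT0", list(batch.shape), "FP32")
        infer_input.set_data_from_numpy(batch)

        # Request the output tensor by name and run inference.
        infer_output = httpclient.InferRequestedOutput("OUTPUT0")
        result = client.infer(model_name="my_model",
                              inputs=[infer_input],
                              outputs=[infer_output])
        print(result.as_numpy("OUTPUT0").shape)

    The same request could be made over gRPC (tritonclient.grpc, default port 8001); the HTTP client is shown here only because it is the simpler of the two to demonstrate.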
    Event: GTC Digital Spring
    Date: March 2023
    Industry: All Industries
    Level: Beginner Technical
    Topic: Deep Learning - Inference
    Language: English
    Topic: Deep Learning
    Location: