
    Taking AI Models to Production: Accelerated Inference with Triton Inference Server

    Product Marketing Manager, NVIDIA
    Machine learning and deep learning inference is expected to grow faster than training. But the complexity that teams must manage to deploy, run, and scale models in production is enormous: multiple frameworks, evolving model architectures, high query volumes, diverse computing platforms, and cloud-to-edge AI. There's a need to standardize and streamline inference without sacrificing model performance. We'll look at recent additions to the open-source inference serving software, Triton Inference Server, in three broad categories: support for new frameworks and workloads, inference workflow optimization tools, and scaling. We'll also cover integrations with other deployment platforms and tools, and look at how some customers are using Triton to manage this complexity and achieve the key performance indicators they've set for their businesses.
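
    For context, here is a minimal sketch (not taken from the session itself) of what a client-side inference request to a running Triton Inference Server can look like, using the tritonclient Python package over Triton's default HTTP port 8000. The model name "my_model" and the tensor names "INPUT0"/"OUTPUT0" are hypothetical placeholders, assuming a model with a single FP32 input and a single output.

        import numpy as np
        import tritonclient.http as httpclient

        # Connect to a Triton server on its default HTTP endpoint.
        client = httpclient.InferenceServerClient(url="localhost:8000")

        # Describe the input tensor. "INPUT0" and its shape are placeholders
        # that must match the model's config.pbtxt in the model repository.
        batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
        infer_input = httpclient.InferInput("INPUT0", list(batch.shape), "FP32")
        infer_input.set_data_from_numpy(batch)

        # Request the output tensor by name and run inference.
        infer_output = httpclient.InferRequestedOutput("OUTPUT0")
        result = client.infer(model_name="my_model",
                              inputs=[infer_input],
                              outputs=[infer_output])
        print(result.as_numpy("OUTPUT0").shape)

    The same request could be made over gRPC (tritonclient.grpc, default port 8001); the HTTP client is shown here only because it is the simpler of the two to demonstrate.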
    Event: GTC Digital Spring
    Date: March 2023
    Industry: All Industries
    Level: Beginner Technical
    Topic: Deep Learning - Inference
    Language: English
    Topic: Deep Learning
    Location: