      Taking AI Models to Production: Accelerated Inference with Triton Inference Server

      Product Marketing Manager, NVIDIA
      Inference for AI, machine learning, and deep learning is expected to grow faster than training. But the complexity teams must manage to deploy, run, and scale models in production is enormous: multiple frameworks, evolving model architectures, high query volumes, diverse computing platforms, and cloud-to-edge AI. There's a need to standardize and streamline inference without losing model performance. We'll look at recent additions to the open-source inference serving software, Triton Inference Server, in three broad categories: support for new frameworks and workloads, inference workflow optimization tools, and scaling. We'll also cover integrations with other deployment platforms and tools, and look at how some customers are approaching this complexity with Triton and achieving the key performance indicators they've set for their businesses.
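
      To make the "standardize inference across frameworks" point concrete, below is a minimal sketch of a client request using the tritonclient Python package that ships with Triton Inference Server. The model name my_model and the tensor names INPUT0/OUTPUT0 are illustrative assumptions, not details from the session; the same client call works regardless of which framework backend serves the model.

      # A minimal sketch of querying a model through Triton's standard HTTP API.
      # Assumes a hypothetical model "my_model" with one FP32 input tensor
      # "INPUT0" of shape [1, 3, 224, 224] and one output tensor "OUTPUT0".
      import numpy as np
      import tritonclient.http as httpclient

      # Connect to a Triton server running locally on the default HTTP port.
      client = httpclient.InferenceServerClient(url="localhost:8000")

      # Describe the input tensor and fill it with example data.
      inputs = [httpclient.InferInput("INPUT0", [1, 3, 224, 224], "FP32")]
      inputs[0].set_data_from_numpy(
          np.random.rand(1, 3, 224, 224).astype(np.float32)
      )

      # Request the named output and run inference; the protocol is identical
      # whether the backend is TensorRT, ONNX Runtime, PyTorch, or another
      # supported framework.
      outputs = [httpclient.InferRequestedOutput("OUTPUT0")]
      response = client.infer(model_name="my_model", inputs=inputs, outputs=outputs)
      print(response.as_numpy("OUTPUT0").shape)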
      Event: GTC Digital Spring
      Date: March 2023
      Industry: All Industries
      Level: Beginner Technical
      Topic: Deep Learning - Inference
      Language: English
      Topic: Deep Learning
      Location: