      Fast, Scalable, and Standardized AI Inference Deployment for Multiple Frameworks, Diverse Models on CPUs and GPUs with Open-source NVIDIA Triton

      Product Manager, NVIDIA
      Product Marketing Manager, NVIDIA
      We'll go over the NVIDIA Triton Inference Server software and what's new. Triton is open-source inference-serving software for fast, scalable AI in applications. Learn how Triton helps deploy models from all popular frameworks, including TensorFlow, PyTorch, ONNX, TensorRT, RAPIDS FIL (for XGBoost, Scikit-learn Random Forest, and LightGBM), OpenVINO, Python, and even custom C++ backends. Also learn about the features that help optimize inference for multiple query types: real-time, batch, streaming, and model ensembles. We'll cover how to deploy standardized inference in production on both NVIDIA GPUs and x86 and Arm CPUs in the cloud or data center, at the enterprise edge, and even on embedded devices like NVIDIA Jetson, as well as how to use Triton in virtualized environments (e.g., VMware vSphere), on Kubernetes, and in machine-learning platforms such as Amazon SageMaker, Azure ML, and Google Vertex AI.
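      As a concrete taste of the serving workflow the session describes, below is a minimal sketch of sending an inference request to a running Triton server over HTTP with the tritonclient Python package. The model name my_model and the tensor names INPUT0 and OUTPUT0 are placeholders for this sketch; the real names, shapes, and datatypes come from the served model's configuration in your model repository.

      import numpy as np
      import tritonclient.http as httpclient

      # Connect to a Triton server, e.g. one started with:
      #   tritonserver --model-repository=/models
      # The HTTP endpoint listens on port 8000 by default.
      client = httpclient.InferenceServerClient(url="localhost:8000")

      # Describe the request. Tensor names, shapes, and dtypes must match
      # the served model's configuration; the ones below are placeholders.
      inputs = [httpclient.InferInput("INPUT0", [1, 3, 224, 224], "FP32")]
      inputs[0].set_data_from_numpy(
          np.random.rand(1, 3, 224, 224).astype(np.float32))
      outputs = [httpclient.InferRequestedOutput("OUTPUT0")]

      # Run inference and read the result back as a NumPy array.
      response = client.infer("my_model", inputs, outputs=outputs)
      print(response.as_numpy("OUTPUT0").shape)

      The request looks the same whether the model behind my_model runs on the TensorFlow, PyTorch, ONNX, TensorRT, FIL, OpenVINO, or Python backend; Triton selects the backend from the model's configuration, which is part of what makes the serving interface standardized across frameworks.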
      Event: GTC Digital Spring
      Date: March 2022
      Industry: All Industries
      Level: Beginner Technical
      Topic: Deep Learning - Inference
      Language: English
      Location: