Optimizing & Deploying PyTorch Models for High-Performance Inference
, Product Manager for Deep Learning Inference - TensorRT, NVIDIA
, Staff Software Engineer, Meta
, Software Engineer, Meta
Learn about optimizing and deploying dynamic PyTorch models in Python for production. We'll cover the new `torch.package` and `torch::deploy` interfaces, as well as tools for extracting performance from models, such as compression toolkits, `torch.fx`, and more. Then we'll give the latest updates on Torch-TensorRT for maximizing performance on GPUs, including a technical deep dive into how `torch.fx` is being applied to go directly from a PyTorch model to TensorRT, entirely in Python. Participants will come away understanding how the software stack for PyTorch delivers uncompromised flexibility, usability, and performance on NVIDIA GPUs today, and what the future plans are.
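As background for the `torch.fx` portion of the session, here is a minimal sketch of symbolic tracing, the first step a FX-based lowering path (such as the one Torch-TensorRT uses) takes before converting operators to an inference backend. The module `TinyModel` below is a hypothetical example, not from the session itself:

```python
import torch
import torch.fx

class TinyModel(torch.nn.Module):
    """A hypothetical toy model used only to illustrate FX tracing."""
    def forward(self, x):
        return torch.relu(x) + 1.0

# symbolic_trace captures the forward pass as a Graph of ops;
# a backend converter can then walk this graph node-by-node.
traced = torch.fx.symbolic_trace(TinyModel())

for node in traced.graph.nodes:
    print(node.op, node.target)
```

The resulting `GraphModule` is still an executable `nn.Module`, which is what lets an FX-to-TensorRT converter swap graph segments for TensorRT engines while leaving unsupported ops running in PyTorch.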