Large-Scale Production Deployment of RAG Pipelines

Sr. Deep Learning Data Scientist, NVIDIA
Highly Rated
Retrieval-augmented generation (RAG) pipelines are already changing every aspect of modern enterprise operations. Countless online tutorials demonstrate proof-of-concept, naïve RAG applications that cannot handle heavy traffic or large document collections. This training lab bridges that gap by presenting opinionated best practices for production-level deployment. From infrastructure sizing, through a breakdown of an end-to-end Helm-based deployment, to customizing individual pipeline components, we'll provide a high-level overview of the steps your organization will need to take to turn early proofs of concept into enterprise-grade deployments.
Prerequisite(s):

Familiarity with LLM-based applications



Event: GTC 24
Date: March 2024
Industry: All Industries
Level: Intermediate Technical
Topic: Large Language Models (LLMs)
NVIDIA Technology: NeMo, TensorRT, Triton
Language: English
Location: