Large-Scale Production Deployment of RAG Pipelines
Sr. Deep Learning Data Scientist, NVIDIA
Highly rated
Retrieval-augmented generation (RAG) pipelines are already changing every aspect of modern enterprise operations. Countless online tutorials demonstrate proof-of-concept-level, naïve RAG applications that cannot handle heavy traffic or large document collections. This training lab bridges that gap and presents an opinionated set of best practices for production-level deployment. From infrastructure sizing, through a breakdown of an end-to-end Helm-based deployment, to customizing individual pipeline components, we'll provide a high-level overview of the steps your organization will have to take to transform early proofs of concept into enterprise-grade deployments.
Prerequisite(s):