      007 Evaluations for Your Customer Assistant LLM Agent: No Time for Hallucinations

      , Solutions Architect, NVIDIA
      , Solutions Architect, NVIDIA
      , Solutions Architect, NVIDIA
      Evaluate and optimize an LLM customer assistant, ensuring it performs robustly and accurately, with no hallucinations, in real-world scenarios. Leverage the NeMo Evaluator Microservice to create unit tests that continuously track generative AI model quality during development:

      Optimize accuracy in your local languages for multilingual deployments and AI sovereignty.
      Employ LLM-as-a-judge to assess truthfulness and detect toxicity.
      Evaluate end-to-end RAG applications.
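
      The session's NeMo Evaluator workflow isn't reproduced here, but the LLM-as-a-judge pattern from the list above can be sketched generically as a unit test. The rubric, the scoring scale, and the `call_judge` stub below are illustrative assumptions, not NeMo Evaluator APIs; in a real pipeline the stub would be replaced by a call to a judge model endpoint (e.g. a NIM).

      ```python
      # Minimal LLM-as-a-judge sketch. call_judge is a stub standing in for a
      # real LLM endpoint so the harness runs standalone; the prompt and the
      # 1-5 scoring scheme are illustrative assumptions.

      JUDGE_PROMPT = (
          "Rate the ASSISTANT answer for truthfulness against the REFERENCE.\n"
          "Reply with a single integer 1-5.\n"
          "REFERENCE: {reference}\nASSISTANT: {answer}\n"
      )

      def call_judge(prompt: str) -> str:
          # Stub judge: scores high only when the reference fact appears in
          # the answer. A real judge model would reason over the rubric.
          reference = prompt.split("REFERENCE: ")[1].split("\n")[0]
          answer = prompt.split("ASSISTANT: ")[1].split("\n")[0]
          return "5" if reference.lower() in answer.lower() else "1"

      def judge_truthfulness(answer: str, reference: str, threshold: int = 4) -> bool:
          """Unit-test-style check: does the judge's score meet the threshold?"""
          raw = call_judge(JUDGE_PROMPT.format(reference=reference, answer=answer))
          return int(raw.strip()) >= threshold

      # Example: a grounded answer passes, an unsupported one fails.
      assert judge_truthfulness("Our refund window is 30 days.", "30 days")
      assert not judge_truthfulness("Refunds are available any time.", "30 days")
      ```

      Running such checks on every build is what turns evaluation into the continuous unit testing the session describes.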

      Master techniques for visualizing evaluation results, and use those insights to make informed decisions and improvements to your LLM agents. You'll have access to a Kubernetes (k8s) deployment of the NeMo Microservices pipeline and NIM, ready for experimentation.
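
      Before any plotting, per-case judge scores are typically aggregated into per-metric summaries that can be tracked across runs. The field names below are an illustrative assumption, not a NeMo Evaluator result schema.

      ```python
      # Sketch: collapse per-case scores (0-1) into per-metric averages, the
      # kind of summary you'd chart when tracking model quality over time.
      from collections import defaultdict

      def summarize(results):
          """results: list of {"metric": str, "score": float in [0, 1]}."""
          buckets = defaultdict(list)
          for r in results:
              buckets[r["metric"]].append(r["score"])
          return {metric: sum(scores) / len(scores) for metric, scores in buckets.items()}

      runs = [
          {"metric": "truthfulness", "score": 1.0},
          {"metric": "truthfulness", "score": 0.0},
          {"metric": "toxicity", "score": 1.0},
      ]
      print(summarize(runs))  # {'truthfulness': 0.5, 'toxicity': 1.0}
      ```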
      Prerequisite(s):

      Basic understanding of LLMs.
      Basic understanding of evaluations.
      Python.
      Event: GTC 25
      Date: March 2025
      Topic: AI Platforms / Deployment - AI Inference / Inference Microservices
      Industry: All Industries
      NVIDIA Technology: Cloud / Data Center GPU, DGX, HGX, MGX, NeMo, NVIDIA NIM, NVIDIA AI Enterprise
      Level: General
      Language: English
      Location: