      007 Evaluations for Your Customer Assistant LLM Agent: No Time for Hallucinations

      , Solutions Architect, NVIDIA
      , Solutions Architect, NVIDIA
      , Solutions Architect, NVIDIA
      Evaluate and optimize an LLM customer assistant, ensuring it performs robustly and accurately, with no hallucinations, in real-world scenarios. Leverage the NeMo Evaluator Microservice to create unit tests that continuously track generative AI model quality during development:

      Optimize accuracy in your local languages for multilingual deployments and AI sovereignty.
      Employ LLM-as-a-judge to assess truthfulness and detect toxicity.
      Evaluate end-to-end RAG applications.
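
      The session's NeMo Evaluator workflow isn't reproduced here, but the LLM-as-a-judge pattern from the list above can be sketched generically as a unit test. The rubric, the scoring scale, and the `call_judge` stub below are illustrative assumptions, not NeMo Evaluator APIs; in a real pipeline the stub would be replaced by a call to a judge model endpoint (e.g. a NIM).

      ```python
      # Minimal LLM-as-a-judge sketch. call_judge is a stub standing in for a
      # real LLM endpoint so the harness runs standalone; the prompt and the
      # 1-5 scoring scheme are illustrative assumptions.

      JUDGE_PROMPT = (
          "Rate the ASSISTANT answer for truthfulness against the REFERENCE.\n"
          "Reply with a single integer 1-5.\n"
          "REFERENCE: {reference}\nASSISTANT: {answer}\n"
      )

      def call_judge(prompt: str) -> str:
          # Stub judge: scores high only when the reference fact appears in
          # the answer. A real judge model would reason over the rubric.
          reference = prompt.split("REFERENCE: ")[1].split("\n")[0]
          answer = prompt.split("ASSISTANT: ")[1].split("\n")[0]
          return "5" if reference.lower() in answer.lower() else "1"

      def judge_truthfulness(answer: str, reference: str, threshold: int = 4) -> bool:
          """Unit-test-style check: does the judge's score meet the threshold?"""
          raw = call_judge(JUDGE_PROMPT.format(reference=reference, answer=answer))
          return int(raw.strip()) >= threshold

      # Example: a grounded answer passes, an unsupported one fails.
      assert judge_truthfulness("Our refund window is 30 days.", "30 days")
      assert not judge_truthfulness("Refunds are available any time.", "30 days")
      ```

      Running such checks on every build is what turns evaluation into the continuous unit testing the session describes.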

      Master techniques for visualizing evaluation results, and use those insights to make informed decisions and improvements to your LLM agents. You'll have access to a Kubernetes (k8s) deployment of the NeMo Microservices pipeline and NIM, ready for experimentation.
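
      Before any plotting, per-case judge scores are typically aggregated into per-metric summaries that can be tracked across runs. The field names below are an illustrative assumption, not a NeMo Evaluator result schema.

      ```python
      # Sketch: collapse per-case scores (0-1) into per-metric averages, the
      # kind of summary you'd chart when tracking model quality over time.
      from collections import defaultdict

      def summarize(results):
          """results: list of {"metric": str, "score": float in [0, 1]}."""
          buckets = defaultdict(list)
          for r in results:
              buckets[r["metric"]].append(r["score"])
          return {metric: sum(scores) / len(scores) for metric, scores in buckets.items()}

      runs = [
          {"metric": "truthfulness", "score": 1.0},
          {"metric": "truthfulness", "score": 0.0},
          {"metric": "toxicity", "score": 1.0},
      ]
      print(summarize(runs))  # {'truthfulness': 0.5, 'toxicity': 1.0}
      ```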
      Prerequisite(s):

      Basic understanding of LLMs.
      Basic understanding of evaluations.
      Python.
      Event: GTC 25
      Date: March 2025
      Topic: AI Platforms / Deployment - AI Inference / Inference Microservices
      Industry: All Industries
      NVIDIA Technology: Cloud / Data Center GPU, DGX, HGX, MGX, NeMo, NVIDIA NIM, NVIDIA AI Enterprise
      Level: General
      Language: English
      Location: