Build Next-Gen Agents With Large Vision Language Models
Senior Solutions Architect, NVIDIA
Product Marketing Manager, NVIDIA
Technical Marketing Engineer, NVIDIA
Vision-language models (VLMs) are gaining rapid adoption in computer vision, offering scalable, robust zero-shot solutions across many industries. In this lab, you will learn how to use VLM NIMs and the Video Search and Summarization (VSS) AI Blueprint. You will get hands-on experience calling VLM NIM APIs and building a video understanding agent. Then you will dive into the inner workings of VSS to learn how VLMs, LLMs, and the latest Graph RAG techniques are combined to build a powerful agent capable of summarizing long videos and answering questions about them.
Prerequisite(s): Python and computer vision.
Event: GTC 25
Date: March 2025
Industry: All Industries
NVIDIA Technology: Cloud / Data Center GPU, DeepStream, TensorRT, NeMo, Video Storage Toolkit (VST), NVIDIA NIM