      Harnessing Generative AI and Large Language Model With Vision AI Agents

      , Director, Software Engineering, NVIDIA
      , Senior Deep Learning Engineer, NVIDIA
      , Director of Product Management, NVIDIA
      Organizations using computer vision generate petabytes of video and images every day. Insights from this video can be used to identify concerns, boost productivity, improve safety, reduce downtime, and predict outcomes before they happen. Historically, operations teams have had to sift through videos and manually search for incidents, which is costly, depends on accurate metadata, and is wholly inefficient.

      Join us to learn how to use multi-modal models to derive business-critical insights from videos and images instantly. These models take search prompts from users as input and use AI to immediately produce video and image results. They can also perform complex reasoning, correlate sequences of events, and understand exactly when an event is triggered and why. This video understanding ability can be used to solve real-world problems across industries, especially in factories, retail, and warehouses, where environments are complex and logistically challenging.
      Event: GTC 24
      Date: March 2024
      Industry: All Industries
      Level: Technical - Beginner
      Topic: Image / Video Detection & Recognition
      NVIDIA Technology: Metropolis
      Language: English
      Location: