Turn Text into Video: Explore Animated Content Creation with Multimodal Gen AI and NVIDIA NeMo Framework

, Sr. Data Scientist, NVIDIA
, Solutions Architect, NVIDIA
This training explores the principles behind multimodal generative AI, focusing on text-to-video diffusion models and their practical applications. We will begin by defining multimodal generative AI and examining the core mechanisms of text-to-video synthesis. Participants will learn how input descriptions influence video quality and how guardrails—both pre-generation (Pre-Guard) and post-generation (Post-Guard)—help ensure safe and responsible content creation.
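To illustrate the guardrail pattern described above, here is a minimal sketch in Python. It is not NVIDIA's implementation: `check_prompt`, `check_frames`, and `generate_video` are hypothetical placeholders standing in for a prompt-safety check (Pre-Guard), a frame-level content filter (Post-Guard), and the underlying text-to-video model.

```python
# Hypothetical sketch of the Pre-Guard / Post-Guard pattern around a
# text-to-video model. None of these functions are real NVIDIA APIs.

def check_prompt(prompt: str) -> bool:
    """Pre-Guard: reject prompts that violate the content policy (stubbed)."""
    banned_terms = {"violence", "weapon"}  # placeholder policy list
    return not any(term in prompt.lower() for term in banned_terms)

def check_frames(frames) -> bool:
    """Post-Guard: screen generated frames before returning them (stubbed)."""
    return frames is not None  # a real filter would run a safety classifier

def generate_video(prompt: str):
    """Placeholder for the actual text-to-video diffusion model call."""
    return [f"frame generated for: {prompt}"]

def guarded_generation(prompt: str):
    if not check_prompt(prompt):           # Pre-Guard gate before generation
        raise ValueError("Prompt rejected by Pre-Guard.")
    frames = generate_video(prompt)
    if not check_frames(frames):           # Post-Guard gate after generation
        raise ValueError("Output rejected by Post-Guard.")
    return frames
```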

We will break down the process of converting text into numerical representations for diffusion models, encoding video into a latent space for efficient processing, and decoding it back into high-quality outputs. Additionally, we will explore methods to extend generated videos using video-to-video models.
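The encode-diffuse-decode flow outlined above can be sketched at a very high level as follows. The module names (`TextEncoder`, `LatentDiffuser`, `VideoDecoder`) are illustrative stand-ins, not NeMo Framework classes, and the single linear "denoising step" is a toy substitute for a full diffusion schedule.

```python
import torch
import torch.nn as nn

# Illustrative stand-ins only; these are not NeMo Framework components.
class TextEncoder(nn.Module):
    """Maps tokenized text to a single conditioning embedding."""
    def __init__(self, vocab_size=32000, dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
    def forward(self, token_ids):
        return self.embed(token_ids).mean(dim=1)   # (batch, dim)

class LatentDiffuser(nn.Module):
    """One highly simplified denoising step conditioned on the text embedding."""
    def __init__(self, dim=512):
        super().__init__()
        self.denoise = nn.Linear(dim * 2, dim)
    def forward(self, noisy_latent, text_emb):
        return self.denoise(torch.cat([noisy_latent, text_emb], dim=-1))

class VideoDecoder(nn.Module):
    """Decodes the compact latent back into pixel-space video frames."""
    def __init__(self, dim=512, frames=8, height=32, width=32):
        super().__init__()
        self.to_pixels = nn.Linear(dim, frames * 3 * height * width)
        self.out_shape = (frames, 3, height, width)
    def forward(self, latent):
        return self.to_pixels(latent).view(-1, *self.out_shape)

token_ids = torch.randint(0, 32000, (1, 16))               # toy tokenized prompt
text_emb = TextEncoder()(token_ids)                        # text -> numbers
latent = LatentDiffuser()(torch.randn(1, 512), text_emb)   # denoise in latent space
video = VideoDecoder()(latent)                             # decode -> (1, 8, 3, 32, 32)
```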

Attendees will gain insights into building an interactive text-to-video generation pipeline using Gradio, leveraging multi-GPU processing, and optimizing performance with NVIDIA NeMo Framework for faster video generation. While live model training won’t be conducted due to resource constraints, participants will leave with a strong foundational understanding and practical tools to experiment with text-to-video AI models.
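A minimal Gradio wrapper of the kind referred to above could look like the sketch below. It assumes a `generate_video` function that returns the path to an .mp4 file; that function is a placeholder, and this is a generic illustration rather than the training's actual pipeline.

```python
import gradio as gr

def generate_video(prompt: str) -> str:
    """Placeholder: call the text-to-video model and return the output file path."""
    # In a real pipeline this would invoke the diffusion model and write the
    # decoded frames to an .mp4 file.
    return "generated_video.mp4"

demo = gr.Interface(
    fn=generate_video,
    inputs=gr.Textbox(label="Prompt", placeholder="Describe the scene to generate"),
    outputs=gr.Video(label="Generated video"),
    title="Text-to-Video Demo",
)

if __name__ == "__main__":
    demo.launch()  # serves a local web UI for interactive prompting
```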

Prerequisite(s):

Python programming skills and knowledge of AI fundamentals.
Event: GTC 25
Date: March 2025
Industry: All Industries
Level: General
Topic: Generative AI - Video Generation
NVIDIA Technology: NGC, Metropolis, Hopper, NeMo, NVIDIA NIM, NVIDIA AI Enterprise
Language: English
Location: