Turn Text into Video: Explore Animated Content Creation with Multimodal Gen AI and NVIDIA NeMo Framework

, Sr. Data Scientist, NVIDIA
, Solutions Architect, NVIDIA
This training explores the principles behind multimodal generative AI, focusing on text-to-video diffusion models and their practical applications. We will begin by defining multimodal generative AI and examining the core mechanisms of text-to-video synthesis. Participants will learn how input descriptions influence video quality and how guardrails—both pre-generation (Pre-Guard) and post-generation (Post-Guard)—help ensure safe and responsible content creation.
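To illustrate the guardrail pattern described above, here is a minimal sketch in Python. It is not NVIDIA's implementation: `check_prompt`, `check_frames`, and `generate_video` are hypothetical placeholders standing in for a prompt-safety check (Pre-Guard), a frame-level content filter (Post-Guard), and the underlying text-to-video model.

```python
# Hypothetical sketch of the Pre-Guard / Post-Guard pattern around a
# text-to-video model. None of these functions are real NVIDIA APIs.

def check_prompt(prompt: str) -> bool:
    """Pre-Guard: reject prompts that violate the content policy (stubbed)."""
    banned_terms = {"violence", "weapon"}  # placeholder policy list
    return not any(term in prompt.lower() for term in banned_terms)

def check_frames(frames) -> bool:
    """Post-Guard: screen generated frames before returning them (stubbed)."""
    return frames is not None  # a real filter would run a safety classifier

def generate_video(prompt: str):
    """Placeholder for the actual text-to-video diffusion model call."""
    return [f"frame generated for: {prompt}"]

def guarded_generation(prompt: str):
    if not check_prompt(prompt):           # Pre-Guard gate before generation
        raise ValueError("Prompt rejected by Pre-Guard.")
    frames = generate_video(prompt)
    if not check_frames(frames):           # Post-Guard gate after generation
        raise ValueError("Output rejected by Post-Guard.")
    return frames
```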

We will break down the process of converting text into numerical representations for diffusion models, encoding video into a latent space for efficient processing, and decoding it back into high-quality outputs. Additionally, we will explore methods to extend generated videos using video-to-video models.
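The encode-diffuse-decode flow outlined above can be sketched at a very high level as follows. The module names (`TextEncoder`, `LatentDiffuser`, `VideoDecoder`) are illustrative stand-ins, not NeMo Framework classes, and the single linear "denoising step" is a toy substitute for a full diffusion schedule.

```python
import torch
import torch.nn as nn

# Illustrative stand-ins only; these are not NeMo Framework components.
class TextEncoder(nn.Module):
    """Maps tokenized text to a single conditioning embedding."""
    def __init__(self, vocab_size=32000, dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
    def forward(self, token_ids):
        return self.embed(token_ids).mean(dim=1)   # (batch, dim)

class LatentDiffuser(nn.Module):
    """One highly simplified denoising step conditioned on the text embedding."""
    def __init__(self, dim=512):
        super().__init__()
        self.denoise = nn.Linear(dim * 2, dim)
    def forward(self, noisy_latent, text_emb):
        return self.denoise(torch.cat([noisy_latent, text_emb], dim=-1))

class VideoDecoder(nn.Module):
    """Decodes the compact latent back into pixel-space video frames."""
    def __init__(self, dim=512, frames=8, height=32, width=32):
        super().__init__()
        self.to_pixels = nn.Linear(dim, frames * 3 * height * width)
        self.out_shape = (frames, 3, height, width)
    def forward(self, latent):
        return self.to_pixels(latent).view(-1, *self.out_shape)

token_ids = torch.randint(0, 32000, (1, 16))               # toy tokenized prompt
text_emb = TextEncoder()(token_ids)                        # text -> numbers
latent = LatentDiffuser()(torch.randn(1, 512), text_emb)   # denoise in latent space
video = VideoDecoder()(latent)                             # decode -> (1, 8, 3, 32, 32)
```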

Attendees will gain insights into building an interactive text-to-video generation pipeline using Gradio, leveraging multi-GPU processing, and optimizing performance with NVIDIA NeMo Framework for faster video generation. While live model training won’t be conducted due to resource constraints, participants will leave with a strong foundational understanding and practical tools to experiment with text-to-video AI models.
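A minimal Gradio wrapper of the kind referred to above could look like the sketch below. It assumes a `generate_video` function that returns the path to an .mp4 file; that function is a placeholder, and this is a generic illustration rather than the training's actual pipeline.

```python
import gradio as gr

def generate_video(prompt: str) -> str:
    """Placeholder: call the text-to-video model and return the output file path."""
    # In a real pipeline this would invoke the diffusion model and write the
    # decoded frames to an .mp4 file.
    return "generated_video.mp4"

demo = gr.Interface(
    fn=generate_video,
    inputs=gr.Textbox(label="Prompt", placeholder="Describe the scene to generate"),
    outputs=gr.Video(label="Generated video"),
    title="Text-to-Video Demo",
)

if __name__ == "__main__":
    demo.launch()  # serves a local web UI for interactive prompting
```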

Prerequisite(s):

Python programming skills and knowledge of AI fundamentals.
Event: GTC 25
Date: March 2025
Industry: All Industries
Level: General
Topic: Generative AI - Video Generation
NVIDIA Technology: NGC, Metropolis, Hopper, NeMo, NVIDIA NIM, NVIDIA AI Enterprise
Language: English
Location: