Mastering Speech AI for Multilingual Multimedia Transformation
, Senior Product Manager, NVIDIA
, Machine Learning Engineer, OVHcloud
Building practical real-time speech AI applications requires sophisticated software that can handle natural speech, different accents, and domain-specific vocabularies across languages and environments. Learn how to build real-time multimedia transcription, from selecting and optimizing speech AI models to API deployment. We'll show you how to add subtitles and dubbing in a specific language using Riva speech recognition, text-to-speech, and translation. We'll also discuss advanced features such as speaker diarization and text/video extraction. We'll demonstrate customization techniques, such as adapting to domain-specific jargon (medical, legal, etc.), to improve speech transcription, and show how to tune synthesized speech for different pronunciations, tones, and accents. Finally, we'll bring everything together by showing how to build a simple web application that automatically creates subtitles and dubs in a target language.
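As a rough illustration of the subtitle step described above, the sketch below turns timed transcript segments into SubRip (SRT) text. It is not taken from the session: the `Segment` type and both function names are hypothetical, and it assumes the speech recognition and translation stages have already produced text with start and end times in seconds. No Riva calls are shown.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container for one timed piece of transcript text."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # transcribed (or translated) text


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Segment(0.0, 1.5, "Hello"), Segment(1.5, 3.0, "World")]
    print(to_srt(demo))
```

The resulting string can be written to a `.srt` file and loaded by most video players alongside the original media.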
Event: GTC 24
Date: March 2024
Industry: All Industries
Level: Beginner Technical
NVIDIA Technology: Cloud / Data Center GPU, NeMo, TensorRT, Triton