      Mastering Speech AI for Multilingual Multimedia Transformation

      Senior Product Manager, NVIDIA
      Machine Learning Engineer, OVHcloud
      Building practical real-time speech AI applications requires sophisticated software that can handle natural speech, different accents, and domain-specific vocabularies across languages and environments. Learn how to build real-time multimedia transcription, from selecting and optimizing speech AI models to API deployment. We'll show you how to add subtitles and dubbing in a specific language using Riva speech recognition, text-to-speech, and translation. We'll also discuss advanced features such as speaker diarization and text/video extraction. We'll demonstrate customization techniques, such as domain-specific jargon adaptation (medical, legal, etc.), to improve both speech transcription and synthesized speech across different pronunciations, tones, and accents. Finally, we'll bring everything together by building a simple web application that automatically creates subtitles and dubbing in a target language.
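As a rough illustration of the subtitle step described above, the sketch below turns timestamped text segments into an SRT subtitle file body. It assumes the speech recognition and translation stages have already produced segments with start and end times in seconds; the `Segment` class and the `srt_timestamp` and `to_srt` helpers are hypothetical names for this example and are not part of the Riva API or the session material.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """One transcribed (and possibly translated) utterance. Hypothetical type."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # subtitle text for this time range


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms_total = round(seconds * 1000)
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as SRT cues, numbered from 1 and separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Assumed example data; in a real pipeline these would come from ASR output.
    demo = [Segment(0.0, 2.5, "Hello"), Segment(2.5, 4.0, "world")]
    print(to_srt(demo))
```

Keeping the formatter independent of any speech service means the same function can serve the original transcript and its translation, as long as both carry the same timing information.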
      Event: GTC 24
      Date: March 2024
      Industry: All Industries
      Level: Beginner Technical
      NVIDIA Technology: Cloud / Data Center GPU, NeMo, TensorRT, Triton
      Topic: Speech Recognition / Diarization
      Language: English
      Location: