Retrieval-Augmented Language Model and Its Application for Question-Answering and Image Captioning
, Principal Research Scientist, NVIDIA
Language models (LMs) can be substantially improved by retrieving from a large-scale text corpus. In particular, augmenting a generative language model with a retrieval module at the pre-training stage (e.g., RETRO) can significantly reduce perplexity on held-out data. However, beyond lower perplexity, it remains unknown whether a model like RETRO achieves similar gains in downstream task accuracy and text-generation quality. I'll present a comprehensive study of RETRO compared with the standard GPT model. Specifically, we pre-train RETRO models ranging from 148 million to 9.5 billion parameters, retrieving from a corpus of over 330 billion tokens. Extensive experimental results show that RETRO, or our proposed variants, outperforms standard GPT on (1) open-ended text generation, with higher factual accuracy, lower toxicity, and less repetition; (2) the LM Evaluation Harness benchmark under both zero-shot and fine-tuning settings; and (3) open-domain question-answering benchmarks. Furthermore, we augment the RETRO model for image-to-text generation. The resulting Retrieval-Augmented Visual Language Model (Re-ViLM) achieves state-of-the-art zero- and few-shot image captioning results.
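For readers unfamiliar with the mechanism referenced above, the following is a minimal, hypothetical Python sketch of chunk-wise retrieval augmentation in the spirit of RETRO: the input is split into fixed-length chunks, each chunk queries a text corpus for nearest neighbors, and the generator conditions on the neighbors retrieved for the preceding chunk. All names here (embed, retrieve_neighbors, CHUNK_SIZE, NUM_NEIGHBORS) are illustrative assumptions, not the actual implementation discussed in the talk.

```python
import numpy as np

CHUNK_SIZE = 64       # fixed-length input chunks, as in RETRO-style models
NUM_NEIGHBORS = 2     # corpus chunks retrieved per query chunk

def embed(tokens: list) -> np.ndarray:
    """Stand-in for a frozen retriever encoder (e.g., BERT-style embeddings)."""
    rng = np.random.default_rng(sum(tokens))
    return rng.standard_normal(128)

def retrieve_neighbors(chunk: list, corpus: list) -> list:
    """Return the NUM_NEIGHBORS corpus chunks most similar to the query chunk."""
    query = embed(chunk)
    scored = sorted(corpus, key=lambda c: -float(np.dot(embed(c), query)))
    return scored[:NUM_NEIGHBORS]

def retro_forward(tokens: list, corpus: list) -> list:
    """Pair each chunk with the neighbors retrieved for the *previous* chunk,
    preserving causality; the real model applies cross-attention over encoded
    neighbors inside the decoder."""
    chunks = [tokens[i:i + CHUNK_SIZE] for i in range(0, len(tokens), CHUNK_SIZE)]
    augmented, prev_neighbors = [], []
    for chunk in chunks:
        augmented.append((chunk, prev_neighbors))
        prev_neighbors = retrieve_neighbors(chunk, corpus)
    return augmented
```

This sketch only illustrates the data flow; in the pre-trained model the retriever index covers hundreds of billions of tokens and the neighbor conditioning happens through learned cross-attention rather than simple concatenation.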