SynGatorTron: A Large Clinical Natural Language Generation Model for Synthetic Data Generation and Zero-shot Tasks
, Senior Data Scientist, NVIDIA
, Director of Natural Language Processing, UF CTSI/OneFlorida, University of Florida
We propose to develop SynGatorTron, a GPT-3-style model implemented in the NVIDIA Megatron framework and trained on the HiPerGator AI cluster at the University of Florida (an NVIDIA DGX SuperPOD with 140 DGX A100 nodes), to generate naturally de-identified, pre-training-scale synthetic clinical text as a surrogate for training large clinical transformers. Synthetic clinical text generation offers a route to building large, naturally de-identified clinical corpora at a scale that is practically impossible to reach through manual labeling, de-identification, and other privacy-preserving methods. Such a model could preserve the knowledge encoded in medical language while mitigating the risks posed by the sensitive nature of clinical text, and could provide few- and zero-shot capabilities for encoder tasks without the need for extensive labeled datasets or structured clinical ontologies.
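As a minimal sketch of the intended generation workflow, the snippet below samples a synthetic note from a decoder-only (GPT-style) language model via the Hugging Face transformers API. The checkpoint identifier is hypothetical; a real SynGatorTron model would first be trained in Megatron and exported to this format, and the prompt and sampling parameters are illustrative assumptions rather than settings from the proposal.

```python
# Sketch: sampling synthetic clinical text from a decoder-only language model.
# The checkpoint id below is hypothetical, used only for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "uf-health/syngatortron"  # hypothetical checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Seed the model with a section header so the completion reads like a note.
prompt = "HISTORY OF PRESENT ILLNESS:"
inputs = tokenizer(prompt, return_tensors="pt")

# Nucleus sampling encourages varied, natural-sounding synthetic text rather
# than verbatim memorized sequences.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Notes generated this way contain no real patient identifiers by construction, which is what makes the resulting corpus a candidate surrogate for pre-training downstream clinical encoders.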