Virtually all clinical and medical knowledge is contained in the rich free text generated by research papers, physician notes in electronic medical records, lab notebooks, and other sources across healthcare and life sciences. Extracting and structuring information from this language, or using it directly in analytics pipelines, is a generational challenge that modern natural language processing (NLP) is just beginning to address. The uniqueness of clinical speech and text, however, necessitates domain-specific model architectures. NVIDIA has invested heavily in the tools needed to address this challenge and has worked with industry and health-system partners to demonstrate these models' value and promise across a wide range of NLP-enabled tasks.

Using open-source (OSS) technologies and models from NVIDIA, this project creates computable knowledge from unstructured information with an end-to-end pipeline and pretrained biomedical models. Specifically, we will leverage the recent LitCoin challenge dataset to build an end-to-end named entity recognition (NER) to entity linking pipeline in NVIDIA NeMo, and extend the performance and functionality of a reference pipeline together during the workshop (a minimal NeMo sketch of the NER stage is shown below). Finally, we will briefly explore next-generation approaches to knowledge extraction pipelines using large language models such as GPT-3 and NVIDIA's MT-NLG 530B generative architecture. This work is applicable in domains such as drug target identification, prioritization, and repurposing; prior-art exploration; clinical trials analysis; and adverse event detection.

Prerequisite(s):
- Basic familiarity with PyTorch
- Basic understanding of transformer-based NLP models such as BERT
- Basic familiarity with Jupyter Notebooks
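The following is a minimal, illustrative sketch of what the NER stage of such a pipeline can look like in NeMo. It assumes nemo_toolkit[nlp] is installed and uses the general-domain "ner_en_bert" checkpoint as a stand-in for a biomedical model; the workshop's actual checkpoints, data handling, and entity linking stage may differ.

```python
# Minimal sketch of the NER stage of an NER -> entity linking pipeline in NVIDIA NeMo.
# Assumes `pip install nemo_toolkit[nlp]`. The "ner_en_bert" checkpoint is a
# general-domain NGC model used here as a placeholder; a biomedical checkpoint
# (e.g., one fine-tuned on LitCoin annotations) would be substituted in practice.
from nemo.collections.nlp.models import TokenClassificationModel

# Download and restore a pretrained token classification (NER) model from NGC.
ner_model = TokenClassificationModel.from_pretrained(model_name="ner_en_bert")

# Example biomedical-style sentences to tag.
queries = [
    "Imatinib is used to treat chronic myeloid leukemia.",
    "Mutations in BRCA1 increase the risk of breast cancer.",
]

# add_predictions() returns the input text with predicted entity labels inserted.
# In a full pipeline, the labeled mentions would then be passed to an entity
# linking stage that maps each mention to a knowledge-base identifier
# (e.g., a MeSH or UMLS concept).
for tagged in ner_model.add_predictions(queries):
    print(tagged)
```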
*Please disregard any reference to an "Event Code" for access to training materials; Event Codes are only valid during the original live session.
Explore more training options offered by the NVIDIA Deep Learning Institute (DLI).