Deciphering the Language of Antibodies using Self-supervised Learning

Associate Director of Data Science, Alchemab Therapeutics, Ltd.
An individual's B cell receptor (BCR) repertoire encodes information about past immune responses and the potential for future disease protection. One of the grand challenges of BCR sequence analysis is predicting the properties of a BCR from its amino acid sequence alone, when a single patient can have over 1 billion BCRs. We'll present AntiBERTa, an antibody-specific language model trained on millions of BCR sequences using NVIDIA GPUs. AntiBERTa captures biologically meaningful information about BCRs, which makes it generalizable to many downstream applications. Learn how AntiBERTa can be fine-tuned to predict the key positions for molecule binding from an antibody sequence alone, outperforming public tools across multiple metrics. To our knowledge, AntiBERTa is the deepest protein family-specific language model; it provides a rich representation of BCRs and marks a first step toward a deep learning-guided understanding of the human immune system.
Event: GTC Digital Spring
Date: March 2022
Industry: Healthcare and Life Sciences
Topic: Healthcare – Drug Discovery, Genomics
Level: Intermediate Technical
Language: English
Location: