Beyond CUDA: The Case for Block-based GPU Programming

, OpenAI
Traditional single instruction, multiple threads (SIMT) programming with CUDA, for all its benefits, can be daunting for machine learning researchers who need fast custom kernels. We'll shed light on alternative programming models that improve GPU programmability without sacrificing much expressivity. Some such models have recently emerged (e.g., TVM, MLIR Affine), but they are rarely applicable beyond dense tensor algebra, making them a poor fit for workloads that require, for example, custom data structures. We'll describe the design and implementation of Triton, a mid-level programming language that uses _block-based_ abstractions to simplify kernel development and fusion for researchers without GPU programming expertise.
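To give a flavor of the block-based model described in the abstract (this sketch is not part of the original listing), here is a minimal vector-addition kernel written against Triton's public Python API; the tile size `BLOCK_SIZE` and the helper `add` wrapper are illustrative choices, not prescribed by the talk.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance operates on a whole block of data at once,
    # rather than on a single scalar element as in SIMT/CUDA.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements              # guard against out-of-bounds accesses
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Illustrative launch wrapper: one program per 1024-element block.
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

Because the kernel is expressed in terms of blocks, the compiler (rather than the programmer) handles thread-level details such as coalescing and synchronization within each block.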
Event: GTC Digital November
Date: November 2021
Topic: Accelerated Computing & Dev Tools - Programming Languages / Compilers
Industry: All Industries
Level: Intermediate Technical
Language: English
Topic: Deep Learning - Frameworks
Location: