Evolution Path of GPU Inference Accelerating on Breeno Bot/NLP Scenario

, Architect, OPPO
, Not Applicable, OPPO Inc.
OPPO, one of the leading smartphone manufacturers, launched its virtual voice assistant Breeno, a bot similar to Siri. Natural language processing (NLP) is a key technology behind it. We have explored several ways to accelerate model inference on GPUs in Bot/NLP scenarios, including TVM, ONNX Runtime, and TensorRT. We also designed a framework that serves as our integrated inference engine for both NLP and Recommendation/Search/Ads scenarios. In this session, we will share our experience with GPU inference acceleration and with the inference engine framework.
Event: GTC Digital Spring
Date: March 2022
Topic: Accelerated Computing & Dev Tools - Performance Optimization
Industry: Consumer Internet
Level: Intermediate Technical
Language: English
Location: