MatchboxNet: 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition

NVIDIA
In this session, we’ll discuss MatchboxNet, an end-to-end neural network for speech command recognition. MatchboxNet is built from blocks of 1D time-channel separable convolution, batch normalization, ReLU, and dropout layers. It reaches state-of-the-art accuracy on the Google Speech Commands dataset with significantly fewer parameters than similar models. We’ll demonstrate how intensive data augmentation with an auxiliary noise dataset improves robustness to background noise, and how the model's small size makes it viable for voice activity detection.
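To give a rough sense of where the parameter savings come from, the sketch below counts the weights in a standard 1D convolution and in a time-channel separable one. It assumes the usual reading of the term: a depthwise convolution along time followed by a pointwise (kernel size 1) convolution across channels. The function names and the layer sizes (64 channels, kernel width 13) are hypothetical, chosen only for illustration, and are not taken from the session.

```python
def conv1d_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights in a standard 1D convolution (bias omitted)."""
    return c_in * c_out * kernel


def separable_conv1d_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights in a time-channel separable 1D convolution (bias omitted).

    Assumed structure: a depthwise convolution (one length-`kernel` filter
    per input channel, applied along time) followed by a pointwise
    convolution (kernel size 1, mixing channels).
    """
    depthwise = c_in * kernel
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
c_in, c_out, kernel = 64, 64, 13

standard = conv1d_params(c_in, c_out, kernel)
separable = separable_conv1d_params(c_in, c_out, kernel)

print(f"standard conv:  {standard} weights")
print(f"separable conv: {separable} weights")
print(f"reduction:      {standard / separable:.1f}x")
```

For wide kernels the depthwise term stays small, so the pointwise term dominates the cost and the saving grows roughly with the kernel width.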
Event: AWS reInvent
Date: December 2020
Industry: Cloud Services
Topic: Conversational AI
Level: Intermediate Technical
Language: Chinese (Simplified), English, Japanese, Korean, Chinese (Traditional)
Location: