WSJ 6-Gram Language Model

Description
Trained language model for automatic speech recognition: Baidu's CTC decoder with a 6-gram language model trained on the WSJ corpus.
Publisher
NVIDIA
Latest Version
1
Modified
April 4, 2023
Size
590.59 MB

Overview

Transcripts in Automatic Speech Recognition (ASR) systems are commonly generated using only an acoustic model, typically an "end-to-end" CTC-based network that maps audio to text without additional alignment information. However, because the CTC-based network has little prior linguistic knowledge, the transcription can be ambiguous, for example when collapsing repeated characters and removing blanks. This is where a language model comes in: it can improve performance by helping resolve those decoding ambiguities.
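To make the ambiguity concrete, here is a minimal Python sketch of the standard CTC collapsing rule: merge consecutive repeated characters, then drop blanks. The blank symbol and the example strings are illustrative, not taken from this model's vocabulary.

    BLANK = "_"  # hypothetical blank symbol, for illustration only

    def ctc_collapse(frame_labels):
        """Collapse repeated characters across frames, then remove blanks."""
        out = []
        prev = None
        for ch in frame_labels:
            if ch != prev and ch != BLANK:  # merge repeats, drop blanks
                out.append(ch)
            prev = ch
        return "".join(out)

    # Whether a blank separates the repeated 'o' decides between two words:
    print(ctc_collapse("t_oo__"))  # -> "to"
    print(ctc_collapse("t_o_o_"))  # -> "too"

Without linguistic context, the acoustic model alone cannot reliably tell such near-identical frame sequences apart; a language model breaks the tie toward the more probable word sequence.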

We provide a language model trained on Wall Street Journal data. The language model is a KenLM n-gram model used in prefix beam search, which imposes a language-model constraint on each newly predicted character given the most probable previous prefixes. Specifically, we trained Baidu's CTC decoder with its N-gram LM implementation, using N = 6, on WSJ data.
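As a hedged sketch of how the LM score enters decoding, the snippet below uses the KenLM Python bindings to rescore a candidate prefix the way Deep Speech-style prefix beam search does: acoustic log-probability plus a weighted LM log-probability plus a word-insertion bonus. The file name wsj_6gram.binary and the weights ALPHA and BETA are assumptions for illustration, not values shipped with this model.

    import math
    import kenlm

    lm = kenlm.Model("wsj_6gram.binary")  # assumed path to the 6-gram LM

    ALPHA = 0.8  # LM weight (assumed; tuned on a dev set in practice)
    BETA = 1.0   # word-insertion bonus (assumed)

    def combined_score(acoustic_logprob, prefix):
        """Deep Speech-style objective: acoustic + alpha * LM + beta * word count."""
        lm_log10 = lm.score(prefix, bos=True, eos=False)  # KenLM returns log10
        lm_logprob = lm_log10 * math.log(10)              # convert to natural log
        return acoustic_logprob + ALPHA * lm_logprob + BETA * len(prefix.split())

At each beam-search step, candidate prefix expansions would be ranked by this combined score, so linguistically implausible character sequences are pruned before they dominate the beam.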

Datasets

  • Wall Street Journal sentences from CSR-I (WSJ0) Complete and CSR-II (WSJ1) Complete.

Word Error Rate

  • WSJ eval-92 (with the 6-gram WSJ language model): 2.39%
  • WSJ dev-93 (with the 6-gram WSJ language model): 3.76%