This model can be used to translate text from a source language, English (en), to a target language, Russian (ru).
The model is based on the Transformer "Big" architecture originally presented in the "Attention Is All You Need" paper. In this particular instance, the model has 24 layers in the encoder and 6 layers in the decoder. It uses the YouTokenToMe tokenizer.
This model was trained on a collection of many publicly available datasets comprising roughly a hundred million parallel sentences. The NeMo toolkit was used to train this model for roughly 700k steps.
While training this model, we used the following datasets:
We used the YouTokenToMe tokenizer with separate encoder and decoder BPE tokenizers.
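As an illustration, separate encoder and decoder BPE models could be trained with YouTokenToMe roughly as follows (the corpus file names and vocabulary size are assumptions for the sketch, not the exact values used for this model):

import youtokentome as yttm

# Train one BPE model per side (hypothetical corpus files and vocab size).
yttm.BPE.train(data="train.en", model="encoder_bpe.model", vocab_size=32000)
yttm.BPE.train(data="train.ru", model="decoder_bpe.model", vocab_size=32000)

# Tokenize a source sentence with the encoder-side model.
encoder_bpe = yttm.BPE(model="encoder_bpe.model")
print(encoder_bpe.encode(["Hello world"], output_type=yttm.OutputType.SUBWORD))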
The accuracy of translation models is often measured using BLEU scores. The model achieves the following sacreBLEU scores on the WMT'13, WMT'14, WMT'18, WMT'19, and WMT'20 test sets:
WMT'13: 30.5
WMT'14: 44.4
WMT'18: 35.1
WMT'19: 35.8
WMT'20: 25.3
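For reference, sacreBLEU scores can also be computed from Python with the sacrebleu package; a minimal sketch (the hypothesis and reference strings below are illustrative, not taken from the test sets):

import sacrebleu

# One hypothesis per segment; references is a list of reference streams.
hypotheses = ["Привет, мир!"]
references = [["Привет, мир!"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)  # 100.0 for an exact match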
The model is available for use in the NeMo toolkit, and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
import nemo
import nemo.collections.nlp as nemo_nlp

# Load the pre-trained checkpoint by name.
nmt_model = nemo_nlp.models.machine_translation.MTEncDecModel.from_pretrained(model_name="nmt_en_ru_transformer24x6")
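As a sanity check, the encoder and decoder depths quoted above can be read back from the loaded model's configuration; a sketch, assuming the config exposes encoder.num_layers and decoder.num_layers as in NeMo's standard NMT configs:

# Assumed config layout; key names may differ across NeMo versions.
print(nmt_model.cfg.encoder.num_layers)  # expected: 24
print(nmt_model.cfg.decoder.num_layers)  # expected: 6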
Alternatively, translations can be generated by running the NeMo inference script:

python [NEMO_GIT_FOLDER]/examples/nlp/machine_translation/nmt_transformer_infer.py \
    --model=nmt_en_ru_transformer24x6.nemo \
    --srctext=[TEXT_IN_SRC_LANGUAGE] \
    --tgtout=[WHERE_TO_SAVE_TRANSLATIONS] \
    --source_lang en \
    --target_lang ru
The translate method of the NMT model accepts a list of de-tokenized strings in the source language.
The translate method outputs a list of de-tokenized strings in the target language.
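For example, a minimal inference sketch (the input sentence is illustrative, and the source_lang/target_lang arguments are assumed to match the translate signature in recent NeMo releases):

# Hypothetical example sentence; translate() returns a list of strings.
translations = nmt_model.translate(["Neural machine translation is fun."], source_lang="en", target_lang="ru")
print(translations[0])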
No known limitations at this time.
Vaswani, Ashish, et al. "Attention Is All You Need." arXiv preprint arXiv:1706.03762 (2017).