NVIDIA
NMT En Ru Transformer24x6

Neural Machine Translation (NMT) model to translate from English to Russian

Model Overview

This model translates text from the source language (English, En) into the target language (Russian, Ru).

Model Architecture

The model is based on the Transformer "Big" architecture originally presented in the "Attention Is All You Need" paper [1]. In this particular instance, the model has 24 layers in the encoder and 6 layers in the decoder. It uses the YouTokenToMe tokenizer [2].
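
To give a feel for what the 24x6 layout implies, the sketch below estimates the number of weights in the encoder and decoder stacks. It assumes the "big" hyperparameters from [1] (model dimension 1024, feed-forward dimension 4096), which this model card does not confirm, and it ignores embeddings, biases, and layer normalization. All names are hypothetical and chosen for illustration only.

```python
# Assumed "big" hyperparameters from [1]; NOT confirmed by this model card.
D_MODEL = 1024
D_FF = 4096
ENCODER_LAYERS = 24
DECODER_LAYERS = 6


def attention_weights(d_model: int) -> int:
    """Query, key, value and output projections of one attention block."""
    return 4 * d_model * d_model


def feed_forward_weights(d_model: int, d_ff: int) -> int:
    """Two linear maps: d_model -> d_ff and d_ff -> d_model."""
    return 2 * d_model * d_ff


def encoder_layer_weights(d_model: int, d_ff: int) -> int:
    """Self-attention plus feed-forward."""
    return attention_weights(d_model) + feed_forward_weights(d_model, d_ff)


def decoder_layer_weights(d_model: int, d_ff: int) -> int:
    """Self-attention, encoder-decoder attention, and feed-forward."""
    return 2 * attention_weights(d_model) + feed_forward_weights(d_model, d_ff)


encoder_total = ENCODER_LAYERS * encoder_layer_weights(D_MODEL, D_FF)
decoder_total = DECODER_LAYERS * decoder_layer_weights(D_MODEL, D_FF)
total = encoder_total + decoder_total

print(f"encoder: {encoder_total:,}  decoder: {decoder_total:,}  total: {total:,}")
```

Under these assumptions the two stacks hold roughly 403 million weights, about three quarters of them in the encoder. The real checkpoint may differ if its hyperparameters differ from the ones assumed here.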

Training

This model was trained on a collection of publicly available datasets comprising roughly one hundred million parallel sentences. Training used the NeMo toolkit [5] and ran for roughly 700k steps.

Datasets

While training this model, we used the following datasets:

Tokenizer Construction

We used the YouTokenToMe tokenizer [2] with separate encoder and decoder BPE tokenizers.
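
For readers unfamiliar with BPE, the following is a minimal sketch of how byte-pair-encoding merges can be learned from a toy word-frequency table. It illustrates the general technique only; it is not the YouTokenToMe implementation or its API, and the function names and toy corpus are hypothetical. With separate encoder and decoder tokenizers, a procedure of this kind would be run once on the English side and once on the Russian side of the training data.

```python
from collections import Counter


def count_pairs(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merges from a {word: count} table."""
    # Start from single characters as the initial symbols.
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(vocab)
        if not pairs:
            break
        # Most frequent pair; ties go to the pair seen first.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


# Hypothetical toy corpus: word -> number of occurrences.
toy_corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, vocab = learn_bpe(toy_corpus, num_merges=3)
print(merges)
```

Each learned merge becomes a new subword unit, so frequent character sequences such as a common suffix end up as single tokens while rare words remain split into smaller pieces.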

Performance

The accuracy of translation models is often measured using BLEU scores [3]. The model achieves the following sacreBLEU [4] scores on the WMT'13, WMT'14, WMT'18, WMT'19 and WMT'20 test sets:

WMT'13 - 30.5
WMT'14 - 44.4
WMT'18 - 35.1
WMT'19 - 35.8
WMT'20 - 25.3
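
To make the metric concrete, here is a simplified sentence-level BLEU sketch: clipped n-gram precisions up to 4-grams, combined by a geometric mean and multiplied by a brevity penalty, reported on a 0 to 100 scale. It is not sacreBLEU and will not reproduce the scores above, which are corpus-level and rely on sacreBLEU's own tokenization. This sketch assumes whitespace tokenization, a single reference, and no smoothing; the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length `n` in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(hypothesis, reference, max_n=4):
    """Simplified single-reference, sentence-level BLEU on a 0-100 scale."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        if matched == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hyp) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return 100 * brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(simple_bleu("the cat sat on the mat", reference))
print(simple_bleu("the cat sat on the mat today", reference))
```

An exact match scores 100, and a hypothesis that shares no 4-gram with the reference scores 0 because the sketch applies no smoothing. For reporting results, a standard tool such as sacreBLEU [4] should be used instead.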

How to Use this Model

The model is available for use in the NeMo toolkit [5], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.

Automatically load the model from NGC

import nemo
import nemo.collections.nlp as nemo_nlp
nmt_model = nemo_nlp.models.machine_translation.MTEncDecModel.from_pretrained(model_name="nmt_en_ru_transformer24x6")

Translating text with this model

python [NEMO_GIT_FOLDER]/examples/nlp/machine_translation/nmt_transformer_infer.py --model=nmt_en_ru_transformer24x6.nemo --srctext=[TEXT_IN_SRC_LANGUAGE] --tgtout=[WHERE_TO_SAVE_TRANSLATIONS] --target_lang ru --source_lang en

Input

The translate method of the NMT model accepts a list of de-tokenized strings.

Output

The translate method outputs a list of de-tokenized strings in the target language.
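
Because the translate method takes a list of strings and returns a list of strings, a large input can be processed in fixed-size batches. The sketch below shows one way to do that. The helper name, the batch size, and the stand-in `fake_translate` function are hypothetical; `translate_fn` represents any callable with the list-in, list-out contract described above, such as the translate method of a loaded model.

```python
from typing import Callable, Iterable, List


def translate_in_batches(
    translate_fn: Callable[[List[str]], List[str]],
    sentences: Iterable[str],
    batch_size: int = 32,
) -> List[str]:
    """Translate `sentences` in chunks of `batch_size`, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    sentences = list(sentences)
    translations: List[str] = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        outputs = translate_fn(batch)
        # Guard the contract: one output string per input string.
        if len(outputs) != len(batch):
            raise RuntimeError("translate_fn must return one output per input")
        translations.extend(outputs)
    return translations


# Stand-in for a real model so the sketch runs without NeMo installed.
def fake_translate(batch: List[str]) -> List[str]:
    return [f"<ru> {text}" for text in batch]


results = translate_in_batches(
    fake_translate, ["Hello.", "How are you?", "Goodbye."], batch_size=2
)
print(results)
```

With a real model, `fake_translate` would be replaced by the model's translate method, assuming it can be called with a single list argument as described above.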

Limitations

No known limitations at this time.

References

[1] Vaswani, Ashish, et al. "Attention is all you need." arXiv preprint arXiv:1706.03762 (2017).

[2] https://github.com/VKCOM/YouTokenToMe

[3] https://en.wikipedia.org/wiki/BLEU

[4] https://github.com/mjpost/sacreBLEU

[5] NVIDIA NeMo Toolkit

License

License to use this model is covered by the NGC TERMS OF USE unless another License/Terms Of Use/EULA is clearly specified. By downloading the public and release version of the model, you accept the terms and conditions of the NGC TERMS OF USE.

Publisher: NVIDIA
Latest Version: 1.5
Updated: April 4, 2023 UTC
Compressed Size: 1.74 GB