Neural Machine Translation (NMT) model to translate from English to French
April 4, 2023
Model Overview

This model translates text from the source language, English (En), into the target language, French (Fr).

Model Architecture

The model is based on the Transformer "Big" architecture originally presented in the "Attention Is All You Need" paper [1]. This particular instance has 12 layers in the encoder and 2 layers in the decoder. It uses the YouTokenToMe tokenizer [2].


Training

This model was trained on a collection of publicly available datasets comprising millions of parallel sentences. The NeMo toolkit [5] was used to train the model for roughly 300k steps.


While training this model, we used the following datasets:

Tokenizer Construction

We used the YouTokenToMe tokenizer [2] to build a single BPE tokenizer that is shared by the encoder and the decoder.
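To illustrate what a shared BPE vocabulary means, here is a minimal pure-Python sketch that learns merge rules from English and French text together. It is a toy, not the YouTokenToMe implementation: the function names (`learn_bpe_merges`, `encode_word`), the `</w>` end-of-word marker, the tie-breaking rule, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(sentences, num_merges):
    """Learn up to `num_merges` BPE merge rules from whitespace-split sentences."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    word_freqs = Counter()
    for sentence in sentences:
        for word in sentence.split():
            word_freqs[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pair_counts = Counter()
        for symbols, freq in word_freqs.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair wins; the pair itself breaks ties deterministically.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        new_freqs = Counter()
        for symbols, freq in word_freqs.items():
            new_freqs[merge_pair(symbols, best)] += freq
        word_freqs = new_freqs
    return merges


def encode_word(word, merges):
    """Split a word into subword symbols by applying the merges in learned order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical toy data: one merge table is learned from both languages at once.
english = ["the national station", "the nation"]
french = ["la station nationale", "la nation"]
shared_merges = learn_bpe_merges(english + french, num_merges=10)

print(encode_word("nation", shared_merges))
print(encode_word("nationale", shared_merges))
```

Because the merges are learned on the concatenated corpus, subwords common to both languages (here, fragments of "nation") map to the same symbols on the encoder and decoder sides.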


Performance

The accuracy of translation models is often measured using BLEU scores [3]. The model achieves the following sacreBLEU [4] scores on the WMT'13 and WMT'14 test sets:

WMT13 - 35.3
WMT14 - 41.3
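For readers unfamiliar with the metric, the sketch below shows how a corpus-level BLEU score is assembled from clipped n-gram precisions and a brevity penalty. It is a simplified illustration, assuming whitespace tokenization, a single reference per sentence, and no smoothing; sacreBLEU applies its own standardized tokenization, so this function will not reproduce the scores reported above. The function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order `n` in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU on a 0-100 scale (one reference per hypothesis)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp_tokens, n)
            ref_counts = ngram_counts(ref_tokens, n)
            # Counter intersection clips each n-gram count at its reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty punishes output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


reference = ["le chat est sur le tapis"]
print(corpus_bleu(["le chat est sur le tapis"], reference))        # exact match
print(corpus_bleu(["le chat est sur le tapis rouge"], reference))  # partial match
```

For reported results, use sacreBLEU itself so that scores are comparable across papers and model cards.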

How to Use this Model

The model is available for use in the NeMo toolkit [5], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.

Automatically load the model from NGC

# Import the NeMo NLP collection, then download the pre-trained checkpoint from NGC by name.
import nemo
import nemo.collections.nlp as nemo_nlp
nmt_model = nemo_nlp.models.machine_translation.MTEncDecModel.from_pretrained(model_name="nmt_en_fr_transformer12x2")

Translating text with this model

python [NEMO_GIT_FOLDER]/examples/nlp/machine_translation/nmt_transformer_infer.py --model=nmt_en_fr_transformer12x2.nemo --srctext=[TEXT_IN_SRC_LANGUAGE] --tgtout=[WHERE_TO_SAVE_TRANSLATIONS] --target_lang fr --source_lang en


Input

The translate method of the NMT model accepts a list of de-tokenized strings.

Output

The translate method outputs a list of de-tokenized strings in the target language.
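Since the translate method maps a list of strings to a list of strings, long inputs can be fed to it in chunks. The helper below is a minimal sketch of that pattern; `translate_in_batches`, its `batch_size` default, and the stand-in translator are hypothetical and not part of NeMo. It takes any callable with the list-in, list-out contract described above, so it can be tried without downloading the model.

```python
from typing import Callable, List


def translate_in_batches(
    translate_fn: Callable[[List[str]], List[str]],
    sentences: List[str],
    batch_size: int = 32,
) -> List[str]:
    """Translate `sentences` in fixed-size chunks, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    translations: List[str] = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        outputs = translate_fn(batch)
        # The contract is one output string per input string.
        if len(outputs) != len(batch):
            raise ValueError("translate_fn returned a different number of strings")
        translations.extend(outputs)
    return translations


# Stand-in translator for demonstration only; it just upper-cases its input.
# With the real model loaded, nmt_model.translate would be passed here instead.
def fake_translate(batch: List[str]) -> List[str]:
    return [sentence.upper() for sentence in batch]


print(translate_in_batches(fake_translate, ["hello", "good morning", "thank you"], batch_size=2))
```

Keeping the model call behind a plain callable also makes it easy to swap in a stub like `fake_translate` when testing surrounding code.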


Limitations

No known limitations at this time.


License

License to use this model is covered by the NGC TERMS OF USE unless another License/Terms Of Use/EULA is clearly specified. By downloading the public and release version of the model, you accept the terms and conditions of the NGC TERMS OF USE.