This model can be used to translate text from English (En) into any of 32 target languages.
The model is based on the Transformer "Big" architecture originally presented in the "Attention Is All You Need" paper [1]. In this particular instance, the model has 12 layers in the encoder and 2 layers in the decoder. It uses a SentencePiece tokenizer [2].
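As a rough illustration of the depth asymmetry, the sketch below builds a model of the same shape with PyTorch's generic Transformer module. The width and head count follow the "Big" configuration from [1] and are assumptions about this checkpoint; the actual model uses NeMo's own Megatron-based implementation.

import torch.nn as nn

# Illustrative only: mirrors the 12-layer encoder / 2-layer decoder shape,
# not the actual NeMo/Megatron implementation of this checkpoint.
model = nn.Transformer(
    d_model=1024,           # "Big" width from [1] (assumed for this checkpoint)
    nhead=16,               # "Big" attention heads from [1] (assumed)
    num_encoder_layers=12,  # deep encoder, as described above
    num_decoder_layers=2,   # shallow decoder speeds up autoregressive decoding
)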
The model was trained on a collection of many publicly available datasets comprising millions of parallel sentences.
We used the SentencePiece tokenizer [2] with a single BPE model shared between the encoder and the decoder.
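For readers unfamiliar with this setup, the sketch below shows what training and applying a shared SentencePiece BPE model involves. The file names and vocabulary size are hypothetical, not the settings used for this model.

import sentencepiece as spm

# Hypothetical: train one BPE model on text from both sides of the parallel
# corpus so that the encoder and decoder share a single vocabulary.
spm.SentencePieceTrainer.train(
    input="corpus.both_sides.txt",  # hypothetical file with source and target text
    model_prefix="shared_bpe",
    vocab_size=32000,               # assumed; not this model's actual setting
    model_type="bpe",
)

sp = spm.SentencePieceProcessor(model_file="shared_bpe.model")
print(sp.encode("Machine translation is useful.", out_type=str))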
The model is available for use in the NeMo toolkit [5] and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset. To run inference:
python [NEMO_GIT_FOLDER]/examples/nlp/machine_translation/nmt_transformer_infer_megatron.py model_file=megatronnmt_en_any_500m.nemo srctext=[TEXT_IN_SRC_LANGUAGE] tgtout=[WHERE_TO_SAVE_TRANSLATIONS] source_lang=en target_lang=[TARGET_LANGUAGE]
where [TARGET_LANGUAGE] can be 'cs', 'da', 'de', 'el', 'es', 'fi', 'fr', 'hu', 'it', 'lt', 'lv', 'nl', 'no', 'pl', 'pt', 'ro', 'ru', 'sk', 'sv', 'zh', 'ja', 'hi', 'ko', 'et', 'sl', 'bg', 'uk', 'hr', 'ar', 'vi', 'tr', 'id'
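For example, to translate English text into German (the input and output paths here are hypothetical):

python [NEMO_GIT_FOLDER]/examples/nlp/machine_translation/nmt_transformer_infer_megatron.py model_file=megatronnmt_en_any_500m.nemo srctext=input_en.txt tgtout=output_de.txt source_lang=en target_lang=de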
The translate method of the NMT model accepts a list of de-tokenized strings and returns a list of de-tokenized strings in the target language.
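A minimal sketch of programmatic use, assuming the checkpoint restores through NeMo's MTEncDecModel interface (Megatron-based checkpoints may require a different model class and a Megatron-aware environment):

from nemo.collections.nlp.models import MTEncDecModel

# Restore the downloaded checkpoint (path is hypothetical).
model = MTEncDecModel.restore_from("megatronnmt_en_any_500m.nemo")

# translate() takes and returns lists of de-tokenized strings.
translations = model.translate(
    ["Machine translation is useful."],
    source_lang="en",
    target_lang="de",
)
print(translations[0])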
No known limitations at this time.
[1] Vaswani, Ashish, et al. "Attention is all you need." arXiv preprint arXiv:1706.03762 (2017).
[2] Kudo, Taku, and John Richardson. "SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing." arXiv preprint arXiv:1808.06226 (2018). https://github.com/google/sentencepiece
[3] BLEU: https://en.wikipedia.org/wiki/BLEU
[4] sacreBLEU: https://github.com/mjpost/sacreBLEU
[5] NVIDIA NeMo: https://github.com/NVIDIA/NeMo
This work is licensed under NSCLv1.