Riva Megatron NMT any2any

Description: Machine Translation model for Any to Any direction
Publisher: NVIDIA
Latest Version: 2.18.0
Modified: March 7, 2025
Size: 2.96 GB

Machine Translation: Multilingual 1.6B Any-Any NMT Model - Model Overview

Description:

The Megatron Multilingual 1.6B Neural Machine Translation model translates text in any-to-any directions across the 37 supported languages, including non-English-centric directions (such as French to Chinese). The supported languages are: English (en), Czech (cs), Danish (da), German (de), Greek (el), European Spanish (es-ES), LATAM Spanish (es-US), Finnish (fi), French (fr), Hungarian (hu), Italian (it), Lithuanian (lt), Latvian (lv), Dutch (nl), Norwegian (no), Polish (pl), European Portuguese (pt-PT), Brazilian Portuguese (pt-BR), Romanian (ro), Russian (ru), Slovak (sk), Swedish (sv), Simplified Chinese (zh-CN), Traditional Chinese (zh-TW), Japanese (ja), Hindi (hi), Korean (ko), Estonian (et), Slovenian (sl), Bulgarian (bg), Ukrainian (uk), Croatian (hr), Arabic (ar), Vietnamese (vi), Turkish (tr), Indonesian (id), Thai (th). This model is ready for commercial use.
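The language list above can be captured in a small lookup for validating a requested translation direction before sending it to the service. This is a minimal sketch: the codes are copied from the list above, while the names `SUPPORTED_LANGUAGES`, `is_supported_pair`, and `count_directions` are illustrative and not part of Riva. It assumes codes are matched exactly as written (for example `es-ES`, not `es-es`).

```python
# Language codes copied from the model card's list.
# The dictionary and helper names are hypothetical, not a Riva API.
SUPPORTED_LANGUAGES = {
    "en": "English", "cs": "Czech", "da": "Danish", "de": "German",
    "el": "Greek", "es-ES": "European Spanish", "es-US": "LATAM Spanish",
    "fi": "Finnish", "fr": "French", "hu": "Hungarian", "it": "Italian",
    "lt": "Lithuanian", "lv": "Latvian", "nl": "Dutch", "no": "Norwegian",
    "pl": "Polish", "pt-PT": "European Portuguese",
    "pt-BR": "Brazilian Portuguese", "ro": "Romanian", "ru": "Russian",
    "sk": "Slovak", "sv": "Swedish", "zh-CN": "Simplified Chinese",
    "zh-TW": "Traditional Chinese", "ja": "Japanese", "hi": "Hindi",
    "ko": "Korean", "et": "Estonian", "sl": "Slovenian", "bg": "Bulgarian",
    "uk": "Ukrainian", "hr": "Croatian", "ar": "Arabic", "vi": "Vietnamese",
    "tr": "Turkish", "id": "Indonesian", "th": "Thai",
}


def is_supported_pair(source: str, target: str) -> bool:
    """Return True if both codes are listed and the direction is not X -> X."""
    return (
        source in SUPPORTED_LANGUAGES
        and target in SUPPORTED_LANGUAGES
        and source != target
    )


def count_directions() -> int:
    """Count ordered pairs of distinct language codes."""
    n = len(SUPPORTED_LANGUAGES)
    return n * (n - 1)
```

With 37 codes, `count_directions()` gives 37 × 36 = 1332 ordered pairs. That figure is only an upper bound on useful directions, since it also counts pairs between regional variants of the same language such as `es-ES` and `es-US`.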

Model Architecture

Architecture Type: Transformer

Network Architecture: Megatron

The model is based on the Transformer architecture originally presented in the "Attention Is All You Need" paper [1]. This instance has 24 layers in the encoder and 24 layers in the decoder, and it uses a SentencePiece tokenizer [2].

Input:

Input Type(s): Text String
Input Format(s): List

Other Properties Related to Input: No Pre-Processing Needed; No Tokenization required; 1024 Character Text String Limit (No non-textual characters)
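Because the input is a list of text strings with a 1024-character limit, longer documents need to be split on the client side. The sketch below shows one way to do that, assuming the limit applies to each string in the list. The function name `split_for_translation` and the sentence-boundary regex are illustrative choices, not something the model card specifies.

```python
import re
from typing import List

MAX_CHARS = 1024  # character limit stated in the model card


def split_for_translation(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most max_chars characters.

    Chunks break at sentence boundaries where possible; a single sentence
    longer than max_chars is cut at the limit as a fallback.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Fallback: hard-split a sentence that cannot fit in one chunk.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

The returned list can be passed as the input list, and the translated chunks joined afterwards. Splitting at sentence boundaries is a design choice meant to keep each sentence intact for the translator; the hard cut is only a last resort.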

Output:

Output Type(s): Text String
Output Format: List
Output Parameters: Selected Language
Other Properties Related to Output: Outputs are not tokenized or processed to hide sensitive input information

References:

[1] Vaswani, Ashish, et al. "Attention is all you need." arXiv preprint arXiv:1706.03762 (2017).
[2] https://github.com/google/sentencepiece
[3] https://en.wikipedia.org/wiki/BLEU
[4] https://github.com/mjpost/sacreBLEU
[5] NVIDIA NeMo Toolkit

Software Integration:

Runtime Engine(s): Riva 2.18.0

Supported Hardware Platform(s):

  • NVIDIA Ampere
  • NVIDIA Hopper
  • NVIDIA Jetson
  • NVIDIA Lovelace
  • NVIDIA Turing
  • NVIDIA Volta

Supported Operating System(s):

  • Linux
  • Linux 4 Tegra

Model Version(s):

rmir_nmt_megatron_1b_any_any:2.18.0

Training & Evaluation Dataset:

Data Collection Method by dataset:

  • Human

Labeling Method by dataset:

  • Automated

Performance of the models

Performance of the model in Any -> Any directions on the Flores-101 dataset:

| Languages | de    | es-es | es-us | fr    | ja    | ru    | zh-cn |
|-----------|-------|-------|-------|-------|-------|-------|-------|
| de        | -     | 24.50 | 24.10 | 39.30 | 27.30 | 26.10 | 33.30 |
| es-es     | 22.10 | -     | -     | 30.30 | 23.50 | 20.20 | 29.80 |
| es-us     | 22.10 | -     | -     | 30.30 | 23.50 | 20.20 | 29.80 |
| fr        | 25.00 | 24.80 | 30.40 | -     | 26.60 | 25.50 | 32.70 |
| ja        | 16.90 | 16.40 | 18.10 | 23.70 | -     | 15.20 | 28.90 |
| ru        | 22.40 | 21.90 | 26.40 | 33.40 | 25.40 | -     | 30.90 |
| zh-cn     | 17.50 | 17.30 | 19.10 | 25.60 | 16.80 | 23.70 | -     |

Performance in the en -> any and any -> en directions on the Flores-101 dataset:

| Language               | Eng -> Language | Language -> Eng |
|------------------------|-----------------|-----------------|
| Czech                  | 32.90           | 41.10           |
| Danish                 | 46.20           | 49.60           |
| German                 | 38.20           | 45.20           |
| Greek                  | 27.50           | 36.50           |
| European Spanish       | 27.60           | 30.70           |
| Latin American Spanish | 26.80           | 30.70           |
| Finnish                | 22.70           | 35.00           |
| French                 | 50.50           | 46.50           |
| Hungarian              | 26.70           | 36.90           |
| Italian                | 29.90           | 34.50           |
| Lithuanian             | 27.50           | 35.10           |
| Latvian                | 31.00           | 37.00           |
| Dutch                  | 26.70           | 32.60           |
| Norwegian              | 34.00           | 44.80           |
| Polish                 | 20.80           | 30.30           |
| European Portuguese    | 48.10           | 50.50           |
| Brazilian Portuguese   | 49.80           | 50.50           |
| Romanian               | 40.70           | 45.00           |
| Russian                | 31.30           | 36.10           |
| Slovak                 | 35.00           | 40.60           |
| Swedish                | 45.00           | 49.60           |
| Simplified Chinese     | 39.50           | 28.50           |
| Traditional Chinese    | 30.80           | 26.80           |
| Japanese               | 32.50           | 26.70           |
| Hindi                  | 33.50           | 39.90           |
| Korean                 | 28.00           | 29.50           |
| Estonian               | 27.30           | 38.90           |
| Slovenian              | 30.70           | 36.20           |
| Bulgarian              | 41.80           | 42.10           |
| Ukrainian              | 30.70           | 40.20           |
| Croatian               | 27.90           | 37.80           |
| Arabic                 | 28.00           | 40.60           |
| Vietnamese             | 41.80           | 36.90           |
| Turkish                | 29.50           | 38.80           |
| Indonesian             | 47.20           | 44.90           |
| Thai                   | 30.90           | 28.10           |

Inference:

Engine: Triton

Test Hardware:

  • NVIDIA Volta V100
  • NVIDIA Turing T4
  • NVIDIA A100 GPU
  • NVIDIA A30 GPU
  • NVIDIA A10 GPU
  • NVIDIA H100 GPU
  • NVIDIA L4 GPU
  • NVIDIA L40 GPU
  • NVIDIA Jetson Orin
  • NVIDIA Jetson AGX Xavier
  • NVIDIA Jetson NX Xavier

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.