The Megatron Multilingual 1.6B Neural Machine Translation model translates text in any-to-any directions across the 37 supported languages, including non-English-centric translation (such as French to Chinese). The supported languages are: English (en), Czech (cs), Danish (da), German (de), Greek (el), European Spanish (es-ES), Latin American Spanish (es-US), Finnish (fi), French (fr), Hungarian (hu), Italian (it), Lithuanian (lt), Latvian (lv), Dutch (nl), Norwegian (no), Polish (pl), European Portuguese (pt-PT), Brazilian Portuguese (pt-BR), Romanian (ro), Russian (ru), Slovak (sk), Swedish (sv), Simplified Chinese (zh-CN), Traditional Chinese (zh-TW), Japanese (ja), Hindi (hi), Korean (ko), Estonian (et), Slovenian (sl), Bulgarian (bg), Ukrainian (uk), Croatian (hr), Arabic (ar), Vietnamese (vi), Turkish (tr), Indonesian (id), Thai (th). This model is ready for commercial use.
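A client may want to check a requested translation direction against the language list above before sending a request. The following is a minimal sketch, not part of the model or of Riva: the names `SUPPORTED_LANGUAGES` and `check_direction` are hypothetical, the codes are matched exactly as spelled in the list, and rejecting identical source and target codes is an assumption rather than documented behavior.

```python
# Language codes copied from the supported-language list above (37 entries).
SUPPORTED_LANGUAGES = frozenset({
    "en", "cs", "da", "de", "el", "es-ES", "es-US", "fi", "fr", "hu",
    "it", "lt", "lv", "nl", "no", "pl", "pt-PT", "pt-BR", "ro", "ru",
    "sk", "sv", "zh-CN", "zh-TW", "ja", "hi", "ko", "et", "sl", "bg",
    "uk", "hr", "ar", "vi", "tr", "id", "th",
})


def check_direction(source: str, target: str) -> None:
    """Raise ValueError if either code is not in the supported list.

    Hypothetical helper: codes are compared case-sensitively, and a pair
    with identical source and target is rejected (an assumption).
    """
    for code in (source, target):
        if code not in SUPPORTED_LANGUAGES:
            raise ValueError(f"unsupported language code: {code!r}")
    if source == target:
        raise ValueError("source and target languages must differ")


# Non-English-centric direction from the description: French to Chinese.
check_direction("fr", "zh-CN")
```

Whether every pair of distinct codes is actually served (for example, between the two Spanish variants) is not stated here, so the check only guards against codes outside the list.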
Architecture Type: Transformer
Network Architecture: Megatron
The model is based on the Transformer architecture originally presented in the paper "Attention Is All You Need" [1]. In this instance, the model has 24 layers in the encoder and 24 layers in the decoder. It uses a SentencePiece tokenizer [2].
Input Type(s): Text String
Input Format(s): List
Other Properties Related to Input: No Pre-Processing Needed; No Tokenization Required; 1024 Character Text String Limit (No non-textual characters)
Output Type(s): Text String
Output Format: List
Output Parameters: Selected Language
Other Properties Related to Output: Outputs are not tokenized or processed to hide sensitive input information
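The input properties above (a list of text strings, each limited to 1024 characters) can be checked on the client side before a request is made. This is a sketch under stated assumptions: `validate_batch` and `MAX_CHARS` are hypothetical names, the limit is assumed to apply per string, and length is counted in Python characters (code points), which may differ from how the service counts. The "no non-textual characters" property is not checked because it is not defined precisely here.

```python
from typing import List

# Per-string limit taken from the input properties above.
MAX_CHARS = 1024


def validate_batch(texts: List[str]) -> List[str]:
    """Return the batch unchanged if it matches the stated input format.

    Hypothetical helper: raises TypeError for a non-list batch or a
    non-string item, and ValueError for a string over MAX_CHARS.
    """
    if not isinstance(texts, list):
        raise TypeError("input must be a list of text strings")
    for index, text in enumerate(texts):
        if not isinstance(text, str):
            raise TypeError(f"item {index} is not a text string")
        if len(text) > MAX_CHARS:
            raise ValueError(
                f"item {index} has {len(text)} characters; limit is {MAX_CHARS}"
            )
    return texts


batch = validate_batch(["Hello, world.", "How are you?"])
```

Raw strings are passed through untouched, which matches the statement that no pre-processing or tokenization is needed on the caller's side.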
[1] Vaswani, Ashish, et al. "Attention Is All You Need." arXiv preprint arXiv:1706.03762 (2017).
[2] https://github.com/google/sentencepiece
[3] https://en.wikipedia.org/wiki/BLEU
[4] https://github.com/mjpost/sacreBLEU
[5] NVIDIA NeMo Toolkit
Runtime Engine(s): [Riva 2.18.0]
Supported Hardware Platform(s):
Supported Operating System(s):
rmir_nmt_megatron_1b_any_any:2.18.0
** Data Collection Method by dataset
** Labeling Method by dataset
Model performance in any-to-any directions on the Flores-101 dataset:
Languages | de | es-es | es-us | fr | ja | ru | zh-cn |
---|---|---|---|---|---|---|---|
de | - | 24.50 | 24.10 | 39.30 | 27.30 | 26.10 | 33.30 |
es-es | 22.10 | - | - | 30.30 | 23.50 | 20.20 | 29.80 |
es-us | 22.10 | - | - | 30.30 | 23.50 | 20.20 | 29.80 |
fr | 25.00 | 24.80 | 30.40 | - | 26.60 | 25.50 | 32.70 |
ja | 16.90 | 16.40 | 18.10 | 23.70 | - | 15.20 | 28.90 |
ru | 22.40 | 21.90 | 26.40 | 33.40 | 25.40 | - | 30.90 |
zh-cn | 17.50 | 17.30 | 19.10 | 25.60 | 16.80 | 23.70 | - |
Model performance in the any-to-English and English-to-any directions on the Flores-101 dataset:
Language | Eng -> Language | Language -> Eng |
---|---|---|
Czech | 32.90 | 41.10 |
Danish | 46.20 | 49.60 |
German | 38.20 | 45.20 |
Greek | 27.50 | 36.50 |
European Spanish | 27.60 | 30.70 |
Latin American Spanish | 26.80 | 30.70 |
Finnish | 22.70 | 35.00 |
French | 50.50 | 46.50 |
Hungarian | 26.70 | 36.90 |
Italian | 29.90 | 34.50 |
Lithuanian | 27.50 | 35.10 |
Latvian | 31.00 | 37.00 |
Dutch | 26.70 | 32.60 |
Norwegian | 34.00 | 44.80 |
Polish | 20.80 | 30.30 |
European Portuguese | 48.10 | 50.50 |
Brazilian Portuguese | 49.80 | 50.50 |
Romanian | 40.70 | 45.00 |
Russian | 31.30 | 36.10 |
Slovak | 35.00 | 40.60 |
Swedish | 45.00 | 49.60 |
Simplified Chinese | 39.50 | 28.50 |
Traditional Chinese | 30.80 | 26.80 |
Japanese | 32.50 | 26.70 |
Hindi | 33.50 | 39.90 |
Korean | 28.00 | 29.50 |
Estonian | 27.30 | 38.90 |
Slovenian | 30.70 | 36.20 |
Bulgarian | 41.80 | 42.10 |
Ukrainian | 30.70 | 40.20 |
Croatian | 27.90 | 37.80 |
Arabic | 28.00 | 40.60 |
Vietnamese | 41.80 | 36.90 |
Turkish | 29.50 | 38.80 |
Indonesian | 47.20 | 44.90 |
Thai | 30.90 | 28.10 |
Engine: Triton
Test Hardware:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.