Model
BERT Large PaddlePaddle checkpoint pretrained with LAMB optimizer using AMP
Use the NGC CLI to download:
1 Version
| Version | Date | Size | Accuracy | Epochs | Batch Size | GPU |
|---|---|---|---|---|---|---|
| 22.08.0_amp_optim-lamb (selected) | 11/23/2022 1:04 AM UTC | 5.03 GB | 0 | 0 | 0 | A100 |
architecture
| Key | Value |
|---|---|
| type | Large |
performance
| Key | Value |
|---|---|
| training_loss | 1.41 |
training
| Key | Value |
|---|---|
| global_batch_size_phase2 | 32768 |
| global_batch_size_phase1 | 65536 |
| iterations_phase1 | 7038 |
| LR_phase2 | 0.004 |
| LR_phase1 | 0.006 |
| iterations_phase2 | 1563 |
| training_precision | AMP |
| bs_phase2 | 32 |
| warmup_proportion_phase2 | 0.128 |
| bs_phase1 | 256 |
| warmup_proportion_phase1 | 0.2843 |
| iterations | 8601 |
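
The training values above are related by simple arithmetic, which the following minimal Python sketch makes explicit. It copies the numbers from the table and derives three quantities. Two of them rest on assumptions that the model card does not state: that the warmup step count is `warmup_proportion` times the phase's iteration count, truncated to a whole step, and that the global batch size equals the per-device batch size (`bs`) multiplied by the number of devices and gradient accumulation steps. The names `PHASES` and `summarize` are illustrative only and are not part of any checkpoint or library.

```python
# Values copied from the "training" table of this model card.
PHASES = {
    "phase1": {
        "iterations": 7038,
        "global_batch_size": 65536,
        "bs": 256,
        "lr": 0.006,
        "warmup_proportion": 0.2843,
    },
    "phase2": {
        "iterations": 1563,
        "global_batch_size": 32768,
        "bs": 32,
        "lr": 0.004,
        "warmup_proportion": 0.128,
    },
}


def summarize(phase):
    """Derive per-phase quantities from the published hyperparameters."""
    # Assumption: warmup lasts warmup_proportion * iterations steps,
    # truncated to a whole number of steps.
    warmup_steps = int(phase["warmup_proportion"] * phase["iterations"])
    # Assumption: global_batch_size = bs * (devices * accumulation steps),
    # so this ratio is the combined device/accumulation factor.
    scale_factor = phase["global_batch_size"] // phase["bs"]
    return {"warmup_steps": warmup_steps, "scale_factor": scale_factor}


summary = {name: summarize(phase) for name, phase in PHASES.items()}

# The table's "iterations" entry is the sum of the two phases.
total_iterations = sum(phase["iterations"] for phase in PHASES.values())

for name, values in summary.items():
    print(name, values)
print("total iterations:", total_iterations)
```

Under these assumptions, the combined device and accumulation factor is larger in phase 2 because the per-device batch size drops from 256 to 32 while the global batch size only halves.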