Model
BERT Base model pretrained on the uncased Wikipedia and BookCorpus datasets with a sequence length of 512.
Use the NGC CLI to download:
Version 1 (selected): 03/26/2020 8:30 PM UTC, 511.75 MB, Accuracy: 0, Epochs: 0, Batch Size: 0, GPU: V100
Finetuning Results
| Key | Value |
|---|---|
| GLUE MRPC ACCURACY | 86.52 |
| GLUE MRPC F1 | 90.53 |
| SQUADV1.1 EM | 82.74 |
| SQUADV1.1 F1 | 89.79 |
| SQUADV2.0 EM | 71.24 |
| SQUADV2.0 F1 | 74.32 |
Pretraining Setup
| Key | Value |
|---|---|
| AMP OPTIMIZATION LEVEL | O1 |
| BATCH SIZE PER GPU | 8 |
| LEARNING RATE | 0.4375E-4 |
| NUMBER OF GPUS | 8 |
| NUMBER OF ITERATIONS | 2285714 |
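
The setup values above imply an effective global batch size and a total amount of training data, which can be worked out with a little arithmetic. The sketch below is illustrative only: it assumes one optimizer step per iteration with no gradient accumulation (the table does not say), and it takes the sequence length of 512 from the model description. The function and constant names are made up for this example.

```python
# Values copied from the Pretraining Setup table.
BATCH_SIZE_PER_GPU = 8
NUM_GPUS = 8
NUM_ITERATIONS = 2285714
LEARNING_RATE = float("0.4375E-4")  # same value as 4.375e-5

# Taken from the model description, not from the table.
SEQ_LENGTH = 512

# Assumption: no gradient accumulation (not stated in the source).
GRAD_ACCUM_STEPS = 1


def global_batch_size(per_gpu: int, gpus: int, accum_steps: int = 1) -> int:
    """Sequences processed per optimizer step across all GPUs."""
    return per_gpu * gpus * accum_steps


def sequences_seen(batch: int, iterations: int) -> int:
    """Total sequences processed over the whole run."""
    return batch * iterations


batch = global_batch_size(BATCH_SIZE_PER_GPU, NUM_GPUS, GRAD_ACCUM_STEPS)
total_sequences = sequences_seen(batch, NUM_ITERATIONS)
# Upper bound on tokens: every sequence counted at full length, padding included.
max_tokens = total_sequences * SEQ_LENGTH

print(f"global batch size: {batch}")
print(f"sequences seen:    {total_sequences:,}")
print(f"max tokens seen:   {max_tokens:,}")
print(f"learning rate:     {LEARNING_RATE:.4e}")
```

If the run did use gradient accumulation, setting `GRAD_ACCUM_STEPS` to the actual value scales the batch size and totals accordingly.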