The Deep Learning Recommendation Model (DLRM) is a recommendation model designed to make use of both categorical and numerical inputs.
Use the NGC CLI to download this resource.
Changelog
October 2021
- Added support for CUDA Graphs
- Switched to PyTorch native AMP for mixed precision training
- Unified the single-GPU and multi-GPU training scripts
- Added support for bring-your-own (BYO) datasets
- Updated performance results
- Updated container version
June 2021
- Updated container version
- Updated performance results
March 2021
- Added NVTabular as a new preprocessing option
- Added a new dataset, xlarge, which uses a frequency threshold of 2
- Added support for a new GPU, A100 80GB, along with its performance results
- Updated Spark preprocessing
- Added Adam as an optional optimizer for embedding and MLPs, for multi-GPU training
- Improved README
August 2020
- Preprocessing with Spark 3 on GPU
- Multiple performance optimizations
- Automatic placement and load balancing of embedding tables
- Improved README
June 2020
- Updated performance tables to include A100 results and multi-GPU setup
- Multi-GPU optimizations
May 2020
- Performance optimizations
April 2020
- Initial release
Known issues
- Adam optimizer performance is not optimized.
- For some seeds, the model's loss can become NaN due to an aggressive learning rate schedule.
- Custom dot interaction kernels for FP16 and TF32 assume that the embedding size is <= 128 and the number of categorical variables is < 32. Pass --interaction_op dot to use the slower native operation in those cases.
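The size limits in the last known issue can be checked before launching training. The following is a minimal sketch, not part of the DLRM scripts: the constant and function names are hypothetical, and only the --interaction_op dot flag and the two limits come from the text above. It returns the extra command-line arguments needed to fall back to the native operation when a configuration exceeds what the custom kernels assume.

```python
from typing import List

# Limits quoted in the known issue; the constant names are hypothetical.
MAX_EMBEDDING_SIZE = 128        # custom kernels assume embedding size <= 128
CATEGORICAL_UPPER_BOUND = 32    # custom kernels assume < 32 categorical variables


def custom_dot_kernels_supported(embedding_size: int, num_categorical: int) -> bool:
    """Return True if the configuration fits the custom kernel assumptions."""
    return (
        embedding_size <= MAX_EMBEDDING_SIZE
        and num_categorical < CATEGORICAL_UPPER_BOUND
    )


def interaction_args(embedding_size: int, num_categorical: int) -> List[str]:
    """Return extra CLI arguments selecting the native dot operation if needed.

    An empty list means no override is required and the default is kept.
    """
    if custom_dot_kernels_supported(embedding_size, num_categorical):
        return []
    # Fall back to the slower native operation, as the known issue recommends.
    return ["--interaction_op", "dot"]


if __name__ == "__main__":
    # Example values chosen for illustration only.
    print(interaction_args(embedding_size=128, num_categorical=26))
    print(interaction_args(embedding_size=256, num_categorical=26))
```

The returned list can be appended to whatever argument list is used to launch training. Note that the categorical-variable bound is exclusive, so a configuration with exactly 32 categorical variables already requires the fallback.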