
GPU Accelerated ML workflows with RAPIDS



Demonstration of GPU-accelerated machine learning and data science workflows using RAPIDS.



Latest Tag



September 1, 2023

Compressed Size: 3.86 GB


20.11 (Latest) Scan Results

Linux / amd64

Accelerated Data Science with RAPIDS

This container provides a demonstration of GPU Accelerated Data Science workflows using RAPIDS.


The RAPIDS suite of open source software libraries and APIs gives you the ability to execute end-to-end data science and analytics pipelines entirely on GPUs. Licensed under Apache 2.0, RAPIDS is incubated by NVIDIA® based on extensive hardware and data science experience. RAPIDS utilizes NVIDIA CUDA® primitives for low-level compute optimization, and exposes GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

RAPIDS also focuses on common data preparation tasks for analytics and data science. This includes a familiar dataframe API that integrates with a variety of machine learning algorithms for end-to-end pipeline acceleration without paying typical serialization costs. RAPIDS also includes support for multi-node, multi-GPU deployments, enabling vastly accelerated processing and training on much larger datasets.
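As a rough illustration of the "familiar dataframe API", the sketch below runs a small group-by aggregation with cuDF and falls back to pandas when cuDF is not installed. The column names (carrier, dep_delay) and the sample values are made up for this example and are not taken from the container's notebooks.

```python
try:
    import cudf as xdf  # RAPIDS GPU dataframe library
except ImportError:
    import pandas as xdf  # CPU fallback so the sketch still runs without RAPIDS


def mean_delay_by_carrier(records):
    """Return the mean departure delay per carrier as a plain dict."""
    df = xdf.DataFrame(records)
    means = df.groupby("carrier")["dep_delay"].mean()
    # cuDF results live on the GPU; copy them to the host before converting.
    if hasattr(means, "to_pandas"):
        means = means.to_pandas()
    return means.to_dict()


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    flights = {
        "carrier": ["AA", "AA", "DL", "DL"],
        "dep_delay": [10.0, 20.0, 0.0, 4.0],
    }
    print(mean_delay_by_carrier(flights))
```

The same function body works with either library because cuDF mirrors the pandas calls used here, which is the point the paragraph above makes about a familiar API.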

To learn more

Please review the following resources:

Installation and Getting Started

Getting started with the application is pretty straightforward with nvidia-docker.

Running from NGC container

This image contains the complete RAPIDS Jupyter Lab environment and tutorial.

1. Download the container from NGC

docker pull

2. Run the notebook server

docker run --gpus all --rm -it -p 8888:8888

Note: Depending on your Docker version, you may have to use 'docker run --runtime=nvidia' or remove '--gpus all'.
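The version dependence in the note can be sketched in a few lines of Python. This is only an illustration: the helper docker_run_args and the "<image>:<tag>" placeholder are hypothetical, and the sketch assumes that the built-in --gpus flag is available from Docker 19.03 onward, with older releases relying on the nvidia runtime instead.

```python
def docker_run_args(docker_version, image, port=8888):
    """Build a `docker run` argument list for the notebook server.

    `image` is a placeholder; substitute the image path shown on NGC.
    """
    major, minor = (int(part) for part in docker_version.split(".")[:2])
    # Assumption: --gpus exists from Docker 19.03; older versions need
    # the nvidia runtime provided by nvidia-docker.
    if (major, minor) >= (19, 3):
        gpu_flags = ["--gpus", "all"]
    else:
        gpu_flags = ["--runtime=nvidia"]
    return ["docker", "run", *gpu_flags, "--rm", "-it",
            "-p", f"{port}:{port}", image]


if __name__ == "__main__":
    # "<image>:<tag>" is a stand-in, not a real image path.
    print(" ".join(docker_run_args("20.10.7", "<image>:<tag>")))
    print(" ".join(docker_run_args("18.09.2", "<image>:<tag>")))
```

Returning a list rather than a single string keeps the arguments safe to pass to subprocess.run without shell quoting.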

3. Connect to notebook server

Jupyter Lab will be available on port 8888!

e.g. if running on a local machine

(or the first available port after that, such as 8889 or 8890, if 8888 is occupied; see the command output)
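The port fallback described above can be approximated with a short probe. This is a minimal sketch of the idea, not the notebook server's actual implementation; the function name first_free_port and the limit of 10 attempts are assumptions made for illustration.

```python
import socket


def first_free_port(start=8888, attempts=10, host="127.0.0.1"):
    """Return the first port in [start, start + attempts) that can be bound."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is occupied, try the next one
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")


if __name__ == "__main__":
    print(first_free_port())
```

A probe like this only reports which port was free at the moment of the check, so the command output of the running server remains the authoritative source for the port actually in use.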

4. Run the notebooks

  • utils/data_loader: Loads 1 year of the airline dataset for ML100 and 10 years for ML200.
  • ML100/1_ML100-gpu: ML100 Data Science workflow on single GPU with RAPIDS.
  • ML100/2_ML100-cpu: ML100 Data Science workflow on CPU for comparison.
  • ML200/ML200: Multi-GPU Data Science Workflow with large dataset using RAPIDS and Dask.

Getting Help & Support

If you have any questions or need help, please email