Linux / arm64
TensorFlow is an open source platform for machine learning. It provides comprehensive tools and libraries in a flexible architecture allowing easy deployment across a variety of platforms and devices. NGC Containers are the easiest way to get started with TensorFlow. The TensorFlow NGC Container comes with all dependencies included, providing an easy place to start developing common applications, such as conversational AI, natural language processing (NLP), recommenders, and computer vision.
The TensorFlow NGC Container is optimized for GPU acceleration, and contains a validated set of libraries that enable and optimize GPU performance. This container may also contain modifications to the TensorFlow source code in order to maximize performance and compatibility. This container also contains software for accelerating ETL (DALI, RAPIDS), Training (cuDNN, NCCL), and Inference (TensorRT) workloads.
Using the TensorFlow NGC Container requires the host system to have the following installed:
For supported versions, see the Framework Containers Support Matrix and the NVIDIA Container Toolkit Documentation.
No other installation, compilation, or dependency management is required. It is not necessary to install the NVIDIA CUDA Toolkit.
To run a container, issue the appropriate command as explained in the Running A Container chapter in the NVIDIA Containers For Deep Learning Frameworks User’s Guide and specify the registry, repository, and tags. For more information about using NGC, refer to the NGC Container User Guide.
If you have Docker 19.03 or later, a typical command to launch the container is:
docker run --gpus all -it --rm nvcr.io/nvidia/tensorflow:xx.xx-tfx-py3
If you have Docker 19.02 or earlier, a typical command to launch the container is:
nvidia-docker run -it --rm nvcr.io/nvidia/tensorflow:xx.xx-tfx-py3
Where:
xx.xx is the container version. For example, 22.01.
tfx is the version of TensorFlow. For example, tf1 or tf2.
TensorFlow is run by importing it as a Python module:
$ python
>>> import tensorflow as tf
# If tf1
>>> print(tf.test.is_gpu_available())
True
# If tf2
>>> len(tf.config.list_physical_devices("GPU")) > 0
True
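The two checks above can be folded into one helper that works in either container variant. This is a minimal sketch, not something shipped in the container: the name gpu_available is hypothetical, and it assumes that a TensorFlow module exposing tf.config.list_physical_devices should use that call, falling back to tf.test.is_gpu_available() otherwise.

```python
def gpu_available(tf):
    """Return True if the given TensorFlow module reports at least one GPU.

    `tf` is the already-imported tensorflow module. It is passed in
    explicitly so this helper has no import-time dependency on TensorFlow.
    """
    config = getattr(tf, "config", None)
    if config is not None and hasattr(config, "list_physical_devices"):
        # tf2-style check: count the visible physical GPU devices.
        return len(config.list_physical_devices("GPU")) > 0
    # tf1-style check: ask TensorFlow directly.
    return bool(tf.test.is_gpu_available())
```

Inside the container it could be called as gpu_available(tf) after import tensorflow as tf.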
See /workspace/README.md inside the container for information on getting started and customizing your TensorFlow image.
You might want to pull in data and model descriptions from locations outside the container for use by TensorFlow. To accomplish this, the easiest method is to mount one or more host directories as Docker bind mounts. For example:
docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/tensorflow:xx.xx-tfx-py3
Note: In order to share data between ranks, NCCL may require shared system memory for IPC and pinned (page-locked) system memory resources. The operating system's limits on these resources may need to be increased accordingly. Refer to your system's documentation for details. In particular, Docker containers default to limited shared and pinned memory resources. When using NCCL inside a container, it is recommended that you increase these resources by issuing:
--shm-size=1g --ulimit memlock=-1
in the docker run command.
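The pieces described so far (image tag, bind mounts, and the NCCL resource flags) can be assembled programmatically. The following is a sketch under stated assumptions rather than an official tool: the function name docker_run_command and the example paths /data and /workspace/data are hypothetical, and only flags that appear in the commands above are used.

```python
import shlex

REGISTRY_IMAGE = "nvcr.io/nvidia/tensorflow"


def docker_run_command(container_version, tf_version, mounts=None, nccl_limits=False):
    """Assemble a `docker run` argument list for the TensorFlow NGC Container.

    container_version: container release, for example "22.01".
    tf_version: "tf1" or "tf2".
    mounts: optional dict mapping host directories to container directories.
    nccl_limits: add the shared-memory and memlock flags recommended for NCCL.
    """
    if tf_version not in ("tf1", "tf2"):
        raise ValueError("tf_version must be 'tf1' or 'tf2'")
    args = ["docker", "run", "--gpus", "all", "-it", "--rm"]
    for host_dir, container_dir in (mounts or {}).items():
        # Each bind mount becomes one -v host_dir:container_dir pair.
        args += ["-v", f"{host_dir}:{container_dir}"]
    if nccl_limits:
        args += ["--shm-size=1g", "--ulimit", "memlock=-1"]
    # The image tag follows the xx.xx-tfx-py3 pattern.
    args.append(f"{REGISTRY_IMAGE}:{container_version}-{tf_version}-py3")
    return args


# Hypothetical host and container paths, for illustration only.
cmd = docker_run_command(
    "22.01", "tf2", mounts={"/data": "/workspace/data"}, nccl_limits=True
)
print(shlex.join(cmd))
```

Returning a list rather than a single string means the result can be passed to subprocess.run without shell quoting concerns; shlex.join is used only to display it.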
Jobs using the TensorFlow NGC Container on Base Command Platform clusters can be launched either by using the NGC CLI tool or by using the Base Command Platform Web UI. To use the NGC CLI tool, configure the Base Command Platform user, team, organization, and cluster information using the ngc config command as described here.
An example command to launch the container on a single-GPU instance is:
ngc batch run --name "My-1-GPU-tensorflow-job" --instance dgxa100.80g.1.norm --commandline "sleep infinity" --result /results --image "nvidia/tensorflow:23.03-tf1-py3"
The TensorFlow container includes JupyterLab, which can be started as part of the job command for easy access to the container and its capabilities. Once the job is created, let it run for 2-3 minutes before opening JupyterLab. An example command that starts JupyterLab as part of a job on a single DGX node is:
ngc batch run --name "My-1-node-tensorflow-jupyterlab-job" --instance dgxa100.80g.8.norm --commandline "jupyter lab --allow-root --ip=* --port=8888 --no-browser --NotebookApp.token='' --NotebookApp.allow_origin='*' --notebook-dir=/ & sleep infinity" --result /results --image "nvidia/tensorflow:23.03-tf1-py3" --port 8888
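In the JupyterLab command above, the port appears twice: once in the jupyter lab invocation and once in the --port option of the job. A small helper can keep the two in sync. This is only a sketch: the function names jupyterlab_commandline and ngc_batch_run_command are hypothetical, and it uses only the ngc batch run options shown in the examples above.

```python
def jupyterlab_commandline(port=8888):
    """Build the in-container command that starts JupyterLab and keeps the job alive."""
    return (
        f"jupyter lab --allow-root --ip=* --port={port} --no-browser "
        "--NotebookApp.token='' --NotebookApp.allow_origin='*' "
        "--notebook-dir=/ & sleep infinity"
    )


def ngc_batch_run_command(name, instance, image, commandline,
                          result="/results", port=None):
    """Assemble an `ngc batch run` argument list from the options used above."""
    args = [
        "ngc", "batch", "run",
        "--name", name,
        "--instance", instance,
        "--commandline", commandline,
        "--result", result,
        "--image", image,
    ]
    if port is not None:
        # Expose the same port that JupyterLab listens on.
        args += ["--port", str(port)]
    return args


port = 8888
job = ngc_batch_run_command(
    name="My-1-node-tensorflow-jupyterlab-job",
    instance="dgxa100.80g.8.norm",
    image="nvidia/tensorflow:23.03-tf1-py3",
    commandline=jupyterlab_commandline(port),
    port=port,
)
```

The resulting list can be handed to subprocess.run on a machine where the NGC CLI is installed and configured.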
For the full list of contents, see the TensorFlow Container Release Notes.
This container image contains the complete source of the NVIDIA version of TensorFlow in /opt/tensorflow. It is prebuilt and installed as a system Python module. There are two versions of the container at each release, containing TensorFlow 1 and TensorFlow 2 respectively. Visit tensorflow.org to learn more about TensorFlow.
The NVIDIA TensorFlow Container is optimized for use with NVIDIA GPUs, and contains the following software for GPU acceleration:
The software stack in this container has been validated for compatibility, and does not require any additional installation or compilation from the end user. This container can help accelerate your deep learning workflow from end to end.
NVIDIA Data Loading Library (DALI) is designed to accelerate data loading and preprocessing pipelines for deep learning applications by offloading them to the GPU. DALI primarily focuses on building data preprocessing pipelines for image, video, and audio data. These pipelines are typically complex and include multiple stages, leading to bottlenecks when run on the CPU. Use this container to get started on accelerating data loading with DALI.
RAPIDS is a suite of open source software libraries and APIs that gives you the ability to execute end-to-end data science and analytics pipelines entirely on the GPU. RAPIDS focuses on common data preparation tasks for analytics and data science. The RAPIDS API is built to mirror commonly used data processing libraries like pandas, so a preexisting codebase can be accelerated on the GPU with only minor changes. Use this container to get started on accelerating your data science pipelines with RAPIDS.
NVIDIA CUDA Deep Neural Network Library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. The version of TensorFlow in this container is precompiled with cuDNN support, and does not require any additional configuration.
NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node communication primitives for NVIDIA GPUs and Networking that take into account system and network topology. NCCL is integrated with TensorFlow to accelerate training on multi-GPU and multi-node systems. In particular, NCCL provides the default all-reduce algorithm for the Mirrored and MultiWorkerMirrored distributed training strategies.
TensorRT is an SDK for high-performance deep learning inference. It includes an inference optimizer and runtime that deliver low latency and high throughput for inference applications. TensorFlow integration with TensorRT (TF-TRT) optimizes and executes compatible subgraphs, allowing TensorFlow to execute the remaining graph. While you can still use TensorFlow's wide and flexible feature set, TensorRT parses the model and applies optimizations to portions of the graph wherever possible.
For the latest Release Notes, see the TensorFlow Release Notes.
For a full list of the supported software and specific versions that come packaged with this framework based on the container image, see the Frameworks Support Matrix.
For more information about TensorFlow, including tutorials, documentation, and examples, see:
To review known CVEs on this image, refer to the Security Scanning tab on this page.
By pulling and using the container, you accept the terms and conditions of this End User License Agreement.